You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

548 lines
17 KiB

7 years ago
mempool: move interface into mempool package (#3524) ## Description Refs #2659 Breaking changes in the mempool package: [mempool] #2659 Mempool now an interface old Mempool renamed to CListMempool NewMempool renamed to NewCListMempool Option renamed to CListOption MempoolReactor renamed to Reactor NewMempoolReactor renamed to NewReactor unexpose TxID method TxInfo.PeerID renamed to SenderID unexpose MempoolReactor.Mempool Breaking changes in the state package: [state] #2659 Mempool interface moved to mempool package MockMempool moved to top-level mock package and renamed to Mempool Non Breaking changes in the node package: [node] #2659 Add Mempool method, which allows you to access mempool ## Commits * move Mempool interface into mempool package Refs #2659 Breaking changes in the mempool package: - Mempool now an interface - old Mempool renamed to CListMempool Breaking changes to state package: - MockMempool moved to mempool/mock package and renamed to Mempool - Mempool interface moved to mempool package * assert CListMempool impl Mempool * gofmt code * rename MempoolReactor to Reactor - combine everything into one interface - rename TxInfo.PeerID to TxInfo.SenderID - unexpose MempoolReactor.Mempool * move mempool mock into top-level mock package * add a fixme TxsFront should not be a part of the Mempool interface because it leaks implementation details. Instead, we need to come up with general interface for querying the mempool so the MempoolReactor can fetch and broadcast txs to peers. * change node#Mempool to return interface * save commit = new reactor arch * Revert "save commit = new reactor arch" This reverts commit 1bfceacd9d65a720574683a7f22771e69af9af4d. * require CListMempool in mempool.Reactor * add two changelog entries * fixes after my own review * quote interfaces, structs and functions * fixes after Ismail's review * make node's mempool an interface * make InitWAL/CloseWAL methods a part of Mempool interface * fix merge conflicts * make node's mempool an interface
6 years ago
7 years ago
7 years ago
7 years ago
pubsub 2.0 (#3227) * green pubsub tests :OK: * get rid of clientToQueryMap * Subscribe and SubscribeUnbuffered * start adapting other pkgs to new pubsub * nope * rename MsgAndTags to Message * remove TagMap it does not bring any additional benefits * bring back EventSubscriber * fix test * fix data race in TestStartNextHeightCorrectly ``` Write at 0x00c0001c7418 by goroutine 796: github.com/tendermint/tendermint/consensus.TestStartNextHeightCorrectly() /go/src/github.com/tendermint/tendermint/consensus/state_test.go:1296 +0xad testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 Previous read at 0x00c0001c7418 by goroutine 858: github.com/tendermint/tendermint/consensus.(*ConsensusState).addVote() /go/src/github.com/tendermint/tendermint/consensus/state.go:1631 +0x1366 github.com/tendermint/tendermint/consensus.(*ConsensusState).tryAddVote() /go/src/github.com/tendermint/tendermint/consensus/state.go:1476 +0x8f github.com/tendermint/tendermint/consensus.(*ConsensusState).handleMsg() /go/src/github.com/tendermint/tendermint/consensus/state.go:667 +0xa1e github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine() /go/src/github.com/tendermint/tendermint/consensus/state.go:628 +0x794 Goroutine 796 (running) created at: testing.(*T).Run() /usr/local/go/src/testing/testing.go:878 +0x659 testing.runTests.func1() /usr/local/go/src/testing/testing.go:1119 +0xa8 testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 testing.runTests() /usr/local/go/src/testing/testing.go:1117 +0x4ee testing.(*M).Run() /usr/local/go/src/testing/testing.go:1034 +0x2ee main.main() _testmain.go:214 +0x332 Goroutine 858 (running) created at: github.com/tendermint/tendermint/consensus.(*ConsensusState).startRoutines() /go/src/github.com/tendermint/tendermint/consensus/state.go:334 +0x221 github.com/tendermint/tendermint/consensus.startTestRound() /go/src/github.com/tendermint/tendermint/consensus/common_test.go:122 +0x63 github.com/tendermint/tendermint/consensus.TestStateFullRound1() /go/src/github.com/tendermint/tendermint/consensus/state_test.go:255 +0x397 testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 ``` * fixes after my own review * fix formatting * wait 100ms before kicking a subscriber out + a test for indexer_service * fixes after my second review * no timeout * add changelog entries * fix merge conflicts * fix typos after Thane's review Co-Authored-By: melekes <anton.kalyaev@gmail.com> * reformat code * rewrite indexer service in the attempt to fix failing test https://github.com/tendermint/tendermint/pull/3227/#issuecomment-462316527 * Revert "rewrite indexer service in the attempt to fix failing test" This reverts commit 0d9107a098230de7138abb1c201877c246e89ed1. * another attempt to fix indexer * fixes after Ethan's review * use unbuffered channel when indexing transactions Refs https://github.com/tendermint/tendermint/pull/3227#discussion_r258786716 * add a comment for EventBus#SubscribeUnbuffered * format code
6 years ago
7 years ago
8 years ago
pubsub 2.0 (#3227) * green pubsub tests :OK: * get rid of clientToQueryMap * Subscribe and SubscribeUnbuffered * start adapting other pkgs to new pubsub * nope * rename MsgAndTags to Message * remove TagMap it does not bring any additional benefits * bring back EventSubscriber * fix test * fix data race in TestStartNextHeightCorrectly ``` Write at 0x00c0001c7418 by goroutine 796: github.com/tendermint/tendermint/consensus.TestStartNextHeightCorrectly() /go/src/github.com/tendermint/tendermint/consensus/state_test.go:1296 +0xad testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 Previous read at 0x00c0001c7418 by goroutine 858: github.com/tendermint/tendermint/consensus.(*ConsensusState).addVote() /go/src/github.com/tendermint/tendermint/consensus/state.go:1631 +0x1366 github.com/tendermint/tendermint/consensus.(*ConsensusState).tryAddVote() /go/src/github.com/tendermint/tendermint/consensus/state.go:1476 +0x8f github.com/tendermint/tendermint/consensus.(*ConsensusState).handleMsg() /go/src/github.com/tendermint/tendermint/consensus/state.go:667 +0xa1e github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine() /go/src/github.com/tendermint/tendermint/consensus/state.go:628 +0x794 Goroutine 796 (running) created at: testing.(*T).Run() /usr/local/go/src/testing/testing.go:878 +0x659 testing.runTests.func1() /usr/local/go/src/testing/testing.go:1119 +0xa8 testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 testing.runTests() /usr/local/go/src/testing/testing.go:1117 +0x4ee testing.(*M).Run() /usr/local/go/src/testing/testing.go:1034 +0x2ee main.main() _testmain.go:214 +0x332 Goroutine 858 (running) created at: github.com/tendermint/tendermint/consensus.(*ConsensusState).startRoutines() /go/src/github.com/tendermint/tendermint/consensus/state.go:334 +0x221 github.com/tendermint/tendermint/consensus.startTestRound() /go/src/github.com/tendermint/tendermint/consensus/common_test.go:122 +0x63 github.com/tendermint/tendermint/consensus.TestStateFullRound1() /go/src/github.com/tendermint/tendermint/consensus/state_test.go:255 +0x397 testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 ``` * fixes after my own review * fix formatting * wait 100ms before kicking a subscriber out + a test for indexer_service * fixes after my second review * no timeout * add changelog entries * fix merge conflicts * fix typos after Thane's review Co-Authored-By: melekes <anton.kalyaev@gmail.com> * reformat code * rewrite indexer service in the attempt to fix failing test https://github.com/tendermint/tendermint/pull/3227/#issuecomment-462316527 * Revert "rewrite indexer service in the attempt to fix failing test" This reverts commit 0d9107a098230de7138abb1c201877c246e89ed1. * another attempt to fix indexer * fixes after Ethan's review * use unbuffered channel when indexing transactions Refs https://github.com/tendermint/tendermint/pull/3227#discussion_r258786716 * add a comment for EventBus#SubscribeUnbuffered * format code
6 years ago
pubsub 2.0 (#3227) * green pubsub tests :OK: * get rid of clientToQueryMap * Subscribe and SubscribeUnbuffered * start adapting other pkgs to new pubsub * nope * rename MsgAndTags to Message * remove TagMap it does not bring any additional benefits * bring back EventSubscriber * fix test * fix data race in TestStartNextHeightCorrectly ``` Write at 0x00c0001c7418 by goroutine 796: github.com/tendermint/tendermint/consensus.TestStartNextHeightCorrectly() /go/src/github.com/tendermint/tendermint/consensus/state_test.go:1296 +0xad testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 Previous read at 0x00c0001c7418 by goroutine 858: github.com/tendermint/tendermint/consensus.(*ConsensusState).addVote() /go/src/github.com/tendermint/tendermint/consensus/state.go:1631 +0x1366 github.com/tendermint/tendermint/consensus.(*ConsensusState).tryAddVote() /go/src/github.com/tendermint/tendermint/consensus/state.go:1476 +0x8f github.com/tendermint/tendermint/consensus.(*ConsensusState).handleMsg() /go/src/github.com/tendermint/tendermint/consensus/state.go:667 +0xa1e github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine() /go/src/github.com/tendermint/tendermint/consensus/state.go:628 +0x794 Goroutine 796 (running) created at: testing.(*T).Run() /usr/local/go/src/testing/testing.go:878 +0x659 testing.runTests.func1() /usr/local/go/src/testing/testing.go:1119 +0xa8 testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 testing.runTests() /usr/local/go/src/testing/testing.go:1117 +0x4ee testing.(*M).Run() /usr/local/go/src/testing/testing.go:1034 +0x2ee main.main() _testmain.go:214 +0x332 Goroutine 858 (running) created at: github.com/tendermint/tendermint/consensus.(*ConsensusState).startRoutines() /go/src/github.com/tendermint/tendermint/consensus/state.go:334 +0x221 github.com/tendermint/tendermint/consensus.startTestRound() /go/src/github.com/tendermint/tendermint/consensus/common_test.go:122 +0x63 github.com/tendermint/tendermint/consensus.TestStateFullRound1() /go/src/github.com/tendermint/tendermint/consensus/state_test.go:255 +0x397 testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 ``` * fixes after my own review * fix formatting * wait 100ms before kicking a subscriber out + a test for indexer_service * fixes after my second review * no timeout * add changelog entries * fix merge conflicts * fix typos after Thane's review Co-Authored-By: melekes <anton.kalyaev@gmail.com> * reformat code * rewrite indexer service in the attempt to fix failing test https://github.com/tendermint/tendermint/pull/3227/#issuecomment-462316527 * Revert "rewrite indexer service in the attempt to fix failing test" This reverts commit 0d9107a098230de7138abb1c201877c246e89ed1. * another attempt to fix indexer * fixes after Ethan's review * use unbuffered channel when indexing transactions Refs https://github.com/tendermint/tendermint/pull/3227#discussion_r258786716 * add a comment for EventBus#SubscribeUnbuffered * format code
6 years ago
pubsub 2.0 (#3227) * green pubsub tests :OK: * get rid of clientToQueryMap * Subscribe and SubscribeUnbuffered * start adapting other pkgs to new pubsub * nope * rename MsgAndTags to Message * remove TagMap it does not bring any additional benefits * bring back EventSubscriber * fix test * fix data race in TestStartNextHeightCorrectly ``` Write at 0x00c0001c7418 by goroutine 796: github.com/tendermint/tendermint/consensus.TestStartNextHeightCorrectly() /go/src/github.com/tendermint/tendermint/consensus/state_test.go:1296 +0xad testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 Previous read at 0x00c0001c7418 by goroutine 858: github.com/tendermint/tendermint/consensus.(*ConsensusState).addVote() /go/src/github.com/tendermint/tendermint/consensus/state.go:1631 +0x1366 github.com/tendermint/tendermint/consensus.(*ConsensusState).tryAddVote() /go/src/github.com/tendermint/tendermint/consensus/state.go:1476 +0x8f github.com/tendermint/tendermint/consensus.(*ConsensusState).handleMsg() /go/src/github.com/tendermint/tendermint/consensus/state.go:667 +0xa1e github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine() /go/src/github.com/tendermint/tendermint/consensus/state.go:628 +0x794 Goroutine 796 (running) created at: testing.(*T).Run() /usr/local/go/src/testing/testing.go:878 +0x659 testing.runTests.func1() /usr/local/go/src/testing/testing.go:1119 +0xa8 testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 testing.runTests() /usr/local/go/src/testing/testing.go:1117 +0x4ee testing.(*M).Run() /usr/local/go/src/testing/testing.go:1034 +0x2ee main.main() _testmain.go:214 +0x332 Goroutine 858 (running) created at: github.com/tendermint/tendermint/consensus.(*ConsensusState).startRoutines() /go/src/github.com/tendermint/tendermint/consensus/state.go:334 +0x221 github.com/tendermint/tendermint/consensus.startTestRound() /go/src/github.com/tendermint/tendermint/consensus/common_test.go:122 +0x63 github.com/tendermint/tendermint/consensus.TestStateFullRound1() /go/src/github.com/tendermint/tendermint/consensus/state_test.go:255 +0x397 testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 ``` * fixes after my own review * fix formatting * wait 100ms before kicking a subscriber out + a test for indexer_service * fixes after my second review * no timeout * add changelog entries * fix merge conflicts * fix typos after Thane's review Co-Authored-By: melekes <anton.kalyaev@gmail.com> * reformat code * rewrite indexer service in the attempt to fix failing test https://github.com/tendermint/tendermint/pull/3227/#issuecomment-462316527 * Revert "rewrite indexer service in the attempt to fix failing test" This reverts commit 0d9107a098230de7138abb1c201877c246e89ed1. * another attempt to fix indexer * fixes after Ethan's review * use unbuffered channel when indexing transactions Refs https://github.com/tendermint/tendermint/pull/3227#discussion_r258786716 * add a comment for EventBus#SubscribeUnbuffered * format code
6 years ago
pubsub 2.0 (#3227) * green pubsub tests :OK: * get rid of clientToQueryMap * Subscribe and SubscribeUnbuffered * start adapting other pkgs to new pubsub * nope * rename MsgAndTags to Message * remove TagMap it does not bring any additional benefits * bring back EventSubscriber * fix test * fix data race in TestStartNextHeightCorrectly ``` Write at 0x00c0001c7418 by goroutine 796: github.com/tendermint/tendermint/consensus.TestStartNextHeightCorrectly() /go/src/github.com/tendermint/tendermint/consensus/state_test.go:1296 +0xad testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 Previous read at 0x00c0001c7418 by goroutine 858: github.com/tendermint/tendermint/consensus.(*ConsensusState).addVote() /go/src/github.com/tendermint/tendermint/consensus/state.go:1631 +0x1366 github.com/tendermint/tendermint/consensus.(*ConsensusState).tryAddVote() /go/src/github.com/tendermint/tendermint/consensus/state.go:1476 +0x8f github.com/tendermint/tendermint/consensus.(*ConsensusState).handleMsg() /go/src/github.com/tendermint/tendermint/consensus/state.go:667 +0xa1e github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine() /go/src/github.com/tendermint/tendermint/consensus/state.go:628 +0x794 Goroutine 796 (running) created at: testing.(*T).Run() /usr/local/go/src/testing/testing.go:878 +0x659 testing.runTests.func1() /usr/local/go/src/testing/testing.go:1119 +0xa8 testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 testing.runTests() /usr/local/go/src/testing/testing.go:1117 +0x4ee testing.(*M).Run() /usr/local/go/src/testing/testing.go:1034 +0x2ee main.main() _testmain.go:214 +0x332 Goroutine 858 (running) created at: github.com/tendermint/tendermint/consensus.(*ConsensusState).startRoutines() /go/src/github.com/tendermint/tendermint/consensus/state.go:334 +0x221 github.com/tendermint/tendermint/consensus.startTestRound() /go/src/github.com/tendermint/tendermint/consensus/common_test.go:122 +0x63 github.com/tendermint/tendermint/consensus.TestStateFullRound1() /go/src/github.com/tendermint/tendermint/consensus/state_test.go:255 +0x397 testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 ``` * fixes after my own review * fix formatting * wait 100ms before kicking a subscriber out + a test for indexer_service * fixes after my second review * no timeout * add changelog entries * fix merge conflicts * fix typos after Thane's review Co-Authored-By: melekes <anton.kalyaev@gmail.com> * reformat code * rewrite indexer service in the attempt to fix failing test https://github.com/tendermint/tendermint/pull/3227/#issuecomment-462316527 * Revert "rewrite indexer service in the attempt to fix failing test" This reverts commit 0d9107a098230de7138abb1c201877c246e89ed1. * another attempt to fix indexer * fixes after Ethan's review * use unbuffered channel when indexing transactions Refs https://github.com/tendermint/tendermint/pull/3227#discussion_r258786716 * add a comment for EventBus#SubscribeUnbuffered * format code
6 years ago
8 years ago
8 years ago
8 years ago
7 years ago
7 years ago
7 years ago
7 years ago
8 years ago
8 years ago
7 years ago
8 years ago
8 years ago
WAL: better errors and new fail point (#3246) * privval: more info in errors * wal: change Debug logs to Info * wal: log and return error on corrupted wal instead of panicing * fail: Exit right away instead of sending interupt * consensus: FAIL before handling our own vote allows to replicate #3089: - run using `FAIL_TEST_INDEX=0` - delete some bytes from the end of the WAL - start normally Results in logs like: ``` I[2019-02-03|18:12:58.225] Searching for height module=consensus wal=/Users/ethanbuchman/.tendermint/data/cs.wal/wal height=1 min=0 max=0 E[2019-02-03|18:12:58.225] Error on catchup replay. Proceeding to start ConsensusState anyway module=consensus err="failed to read data: EOF" I[2019-02-03|18:12:58.225] Started node module=main nodeInfo="{ProtocolVersion:{P2P:6 Block:9 App:1} ID_:35e87e93f2e31f305b65a5517fd2102331b56002 ListenAddr:tcp://0.0.0.0:26656 Network:test-chain-J8JvJH Version:0.29.1 Channels:4020212223303800 Moniker:Ethans-MacBook-Pro.local Other:{TxIndex:on RPCAddress:tcp://0.0.0.0:26657}}" E[2019-02-03|18:12:58.226] Couldn't connect to any seeds module=p2p I[2019-02-03|18:12:59.229] Timed out module=consensus dur=998.568ms height=1 round=0 step=RoundStepNewHeight I[2019-02-03|18:12:59.230] enterNewRound(1/0). Current: 1/0/RoundStepNewHeight module=consensus height=1 round=0 I[2019-02-03|18:12:59.230] enterPropose(1/0). Current: 1/0/RoundStepNewRound module=consensus height=1 round=0 I[2019-02-03|18:12:59.230] enterPropose: Our turn to propose module=consensus height=1 round=0 proposer=AD278B7767B05D7FBEB76207024C650988FA77D5 privValidator="PrivValidator{AD278B7767B05D7FBEB76207024C650988FA77D5 LH:1, LR:0, LS:2}" E[2019-02-03|18:12:59.230] enterPropose: Error signing proposal module=consensus height=1 round=0 err="Error signing proposal: Step regression at height 1 round 0. Got 1, last step 2" I[2019-02-03|18:13:02.233] Timed out module=consensus dur=3s height=1 round=0 step=RoundStepPropose I[2019-02-03|18:13:02.233] enterPrevote(1/0). Current: 1/0/RoundStepPropose module=consensus I[2019-02-03|18:13:02.233] enterPrevote: ProposalBlock is nil module=consensus height=1 round=0 E[2019-02-03|18:13:02.234] Error signing vote module=consensus height=1 round=0 vote="Vote{0:AD278B7767B0 1/00/1(Prevote) 000000000000 000000000000 @ 2019-02-04T02:13:02.233897Z}" err="Error signing vote: Conflicting data" ``` Notice the EOF, the step regression, and the conflicting data. * wal: change errors to be DataCorruptionError * exit on corrupt WAL * fix log * fix new line
6 years ago
8 years ago
7 years ago
7 years ago
7 years ago
7 years ago
7 years ago
7 years ago
8 years ago
7 years ago
cs/replay: execCommitBlock should not read from state.lastValidators (#3067) * execCommitBlock should not read from state.lastValidators * fix height 1 * fix blockchain/reactor_test * fix consensus/mempool_test * fix consensus/reactor_test * fix consensus/replay_test * add CHANGELOG * fix consensus/reactor_test * fix consensus/replay_test * add a test for replay validators change * fix mem_pool test * fix byzantine test * remove a redundant code * reduce validator change blocks to 6 * fix * return peer0 config * seperate testName * seperate testName 1 * seperate testName 2 * seperate app db path * seperate app db path 1 * add a lock before startNet * move the lock to reactor_test * simulate just once * try to find problem * handshake only saveState when app version changed * update gometalinter to 3.0.0 (#3233) in the attempt to fix https://circleci.com/gh/tendermint/tendermint/43165 also code is simplified by running gofmt -s . remove unused vars enable linters we're currently passing remove deprecated linters (cherry picked from commit d47094550315c094512a242445e0dde24b5a03f5) * gofmt code * goimport code * change the bool name to testValidatorsChange * adjust receive kvstore.ProtocolVersion * adjust receive kvstore.ProtocolVersion 1 * adjust receive kvstore.ProtocolVersion 3 * fix merge execution.go * fix merge develop * fix merge develop 1 * fix run cleanupFunc * adjust code according to reviewers' opinion * modify the func name match the convention * simplify simulate a chain containing some validator change txs 1 * test CI error * Merge remote-tracking branch 'upstream/develop' into fixReplay 1 * fix pubsub_test * subscribeUnbuffered vote channel
6 years ago
7 years ago
7 years ago
7 years ago
7 years ago
7 years ago
8 years ago
7 years ago
8 years ago
8 years ago
8 years ago
cs/replay: execCommitBlock should not read from state.lastValidators (#3067) * execCommitBlock should not read from state.lastValidators * fix height 1 * fix blockchain/reactor_test * fix consensus/mempool_test * fix consensus/reactor_test * fix consensus/replay_test * add CHANGELOG * fix consensus/reactor_test * fix consensus/replay_test * add a test for replay validators change * fix mem_pool test * fix byzantine test * remove a redundant code * reduce validator change blocks to 6 * fix * return peer0 config * seperate testName * seperate testName 1 * seperate testName 2 * seperate app db path * seperate app db path 1 * add a lock before startNet * move the lock to reactor_test * simulate just once * try to find problem * handshake only saveState when app version changed * update gometalinter to 3.0.0 (#3233) in the attempt to fix https://circleci.com/gh/tendermint/tendermint/43165 also code is simplified by running gofmt -s . remove unused vars enable linters we're currently passing remove deprecated linters (cherry picked from commit d47094550315c094512a242445e0dde24b5a03f5) * gofmt code * goimport code * change the bool name to testValidatorsChange * adjust receive kvstore.ProtocolVersion * adjust receive kvstore.ProtocolVersion 1 * adjust receive kvstore.ProtocolVersion 3 * fix merge execution.go * fix merge develop * fix merge develop 1 * fix run cleanupFunc * adjust code according to reviewers' opinion * modify the func name match the convention * simplify simulate a chain containing some validator change txs 1 * test CI error * Merge remote-tracking branch 'upstream/develop' into fixReplay 1 * fix pubsub_test * subscribeUnbuffered vote channel
6 years ago
mempool: move interface into mempool package (#3524) ## Description Refs #2659 Breaking changes in the mempool package: [mempool] #2659 Mempool now an interface old Mempool renamed to CListMempool NewMempool renamed to NewCListMempool Option renamed to CListOption MempoolReactor renamed to Reactor NewMempoolReactor renamed to NewReactor unexpose TxID method TxInfo.PeerID renamed to SenderID unexpose MempoolReactor.Mempool Breaking changes in the state package: [state] #2659 Mempool interface moved to mempool package MockMempool moved to top-level mock package and renamed to Mempool Non Breaking changes in the node package: [node] #2659 Add Mempool method, which allows you to access mempool ## Commits * move Mempool interface into mempool package Refs #2659 Breaking changes in the mempool package: - Mempool now an interface - old Mempool renamed to CListMempool Breaking changes to state package: - MockMempool moved to mempool/mock package and renamed to Mempool - Mempool interface moved to mempool package * assert CListMempool impl Mempool * gofmt code * rename MempoolReactor to Reactor - combine everything into one interface - rename TxInfo.PeerID to TxInfo.SenderID - unexpose MempoolReactor.Mempool * move mempool mock into top-level mock package * add a fixme TxsFront should not be a part of the Mempool interface because it leaks implementation details. Instead, we need to come up with general interface for querying the mempool so the MempoolReactor can fetch and broadcast txs to peers. * change node#Mempool to return interface * save commit = new reactor arch * Revert "save commit = new reactor arch" This reverts commit 1bfceacd9d65a720574683a7f22771e69af9af4d. * require CListMempool in mempool.Reactor * add two changelog entries * fixes after my own review * quote interfaces, structs and functions * fixes after Ismail's review * make node's mempool an interface * make InitWAL/CloseWAL methods a part of Mempool interface * fix merge conflicts * make node's mempool an interface
6 years ago
cs/replay: execCommitBlock should not read from state.lastValidators (#3067) * execCommitBlock should not read from state.lastValidators * fix height 1 * fix blockchain/reactor_test * fix consensus/mempool_test * fix consensus/reactor_test * fix consensus/replay_test * add CHANGELOG * fix consensus/reactor_test * fix consensus/replay_test * add a test for replay validators change * fix mem_pool test * fix byzantine test * remove a redundant code * reduce validator change blocks to 6 * fix * return peer0 config * seperate testName * seperate testName 1 * seperate testName 2 * seperate app db path * seperate app db path 1 * add a lock before startNet * move the lock to reactor_test * simulate just once * try to find problem * handshake only saveState when app version changed * update gometalinter to 3.0.0 (#3233) in the attempt to fix https://circleci.com/gh/tendermint/tendermint/43165 also code is simplified by running gofmt -s . remove unused vars enable linters we're currently passing remove deprecated linters (cherry picked from commit d47094550315c094512a242445e0dde24b5a03f5) * gofmt code * goimport code * change the bool name to testValidatorsChange * adjust receive kvstore.ProtocolVersion * adjust receive kvstore.ProtocolVersion 1 * adjust receive kvstore.ProtocolVersion 3 * fix merge execution.go * fix merge develop * fix merge develop 1 * fix run cleanupFunc * adjust code according to reviewers' opinion * modify the func name match the convention * simplify simulate a chain containing some validator change txs 1 * test CI error * Merge remote-tracking branch 'upstream/develop' into fixReplay 1 * fix pubsub_test * subscribeUnbuffered vote channel
6 years ago
  1. package consensus
  2. import (
  3. "bytes"
  4. "fmt"
  5. "hash/crc32"
  6. "io"
  7. "reflect"
  8. //"strconv"
  9. //"strings"
  10. "time"
  11. abci "github.com/tendermint/tendermint/abci/types"
  12. //auto "github.com/tendermint/tendermint/libs/autofile"
  13. dbm "github.com/tendermint/tm-db"
  14. "github.com/tendermint/tendermint/libs/log"
  15. "github.com/tendermint/tendermint/mock"
  16. "github.com/tendermint/tendermint/proxy"
  17. sm "github.com/tendermint/tendermint/state"
  18. "github.com/tendermint/tendermint/types"
  19. "github.com/tendermint/tendermint/version"
  20. )
  21. var crc32c = crc32.MakeTable(crc32.Castagnoli)
  22. // Functionality to replay blocks and messages on recovery from a crash.
  23. // There are two general failure scenarios:
  24. //
  25. // 1. failure during consensus
  26. // 2. failure while applying the block
  27. //
  28. // The former is handled by the WAL, the latter by the proxyApp Handshake on
  29. // restart, which ultimately hands off the work to the WAL.
  30. //-----------------------------------------
  31. // 1. Recover from failure during consensus
  32. // (by replaying messages from the WAL)
  33. //-----------------------------------------
  34. // Unmarshal and apply a single message to the consensus state as if it were
  35. // received in receiveRoutine. Lines that start with "#" are ignored.
  36. // NOTE: receiveRoutine should not be running.
  37. func (cs *ConsensusState) readReplayMessage(msg *TimedWALMessage, newStepSub types.Subscription) error {
  38. // Skip meta messages which exist for demarcating boundaries.
  39. if _, ok := msg.Msg.(EndHeightMessage); ok {
  40. return nil
  41. }
  42. // for logging
  43. switch m := msg.Msg.(type) {
  44. case types.EventDataRoundState:
  45. cs.Logger.Info("Replay: New Step", "height", m.Height, "round", m.Round, "step", m.Step)
  46. // these are playback checks
  47. ticker := time.After(time.Second * 2)
  48. if newStepSub != nil {
  49. select {
  50. case stepMsg := <-newStepSub.Out():
  51. m2 := stepMsg.Data().(types.EventDataRoundState)
  52. if m.Height != m2.Height || m.Round != m2.Round || m.Step != m2.Step {
  53. return fmt.Errorf("RoundState mismatch. Got %v; Expected %v", m2, m)
  54. }
  55. case <-newStepSub.Cancelled():
  56. return fmt.Errorf("Failed to read off newStepSub.Out(). newStepSub was cancelled")
  57. case <-ticker:
  58. return fmt.Errorf("Failed to read off newStepSub.Out()")
  59. }
  60. }
  61. case msgInfo:
  62. peerID := m.PeerID
  63. if peerID == "" {
  64. peerID = "local"
  65. }
  66. switch msg := m.Msg.(type) {
  67. case *ProposalMessage:
  68. p := msg.Proposal
  69. cs.Logger.Info("Replay: Proposal", "height", p.Height, "round", p.Round, "header",
  70. p.BlockID.PartsHeader, "pol", p.POLRound, "peer", peerID)
  71. case *BlockPartMessage:
  72. cs.Logger.Info("Replay: BlockPart", "height", msg.Height, "round", msg.Round, "peer", peerID)
  73. case *VoteMessage:
  74. v := msg.Vote
  75. cs.Logger.Info("Replay: Vote", "height", v.Height, "round", v.Round, "type", v.Type,
  76. "blockID", v.BlockID, "peer", peerID)
  77. }
  78. cs.handleMsg(m)
  79. case timeoutInfo:
  80. cs.Logger.Info("Replay: Timeout", "height", m.Height, "round", m.Round, "step", m.Step, "dur", m.Duration)
  81. cs.handleTimeout(m, cs.RoundState)
  82. default:
  83. return fmt.Errorf("Replay: Unknown TimedWALMessage type: %v", reflect.TypeOf(msg.Msg))
  84. }
  85. return nil
  86. }
  87. // Replay only those messages since the last block. `timeoutRoutine` should
  88. // run concurrently to read off tickChan.
  89. func (cs *ConsensusState) catchupReplay(csHeight int64) error {
  90. // Set replayMode to true so we don't log signing errors.
  91. cs.replayMode = true
  92. defer func() { cs.replayMode = false }()
  93. // Ensure that #ENDHEIGHT for this height doesn't exist.
  94. // NOTE: This is just a sanity check. As far as we know things work fine
  95. // without it, and Handshake could reuse ConsensusState if it weren't for
  96. // this check (since we can crash after writing #ENDHEIGHT).
  97. //
  98. // Ignore data corruption errors since this is a sanity check.
  99. gr, found, err := cs.wal.SearchForEndHeight(csHeight, &WALSearchOptions{IgnoreDataCorruptionErrors: true})
  100. if err != nil {
  101. return err
  102. }
  103. if gr != nil {
  104. if err := gr.Close(); err != nil {
  105. return err
  106. }
  107. }
  108. if found {
  109. return fmt.Errorf("WAL should not contain #ENDHEIGHT %d", csHeight)
  110. }
  111. // Search for last height marker.
  112. //
  113. // Ignore data corruption errors in previous heights because we only care about last height
  114. gr, found, err = cs.wal.SearchForEndHeight(csHeight-1, &WALSearchOptions{IgnoreDataCorruptionErrors: true})
  115. if err == io.EOF {
  116. cs.Logger.Error("Replay: wal.group.Search returned EOF", "#ENDHEIGHT", csHeight-1)
  117. } else if err != nil {
  118. return err
  119. }
  120. if !found {
  121. return fmt.Errorf("Cannot replay height %d. WAL does not contain #ENDHEIGHT for %d", csHeight, csHeight-1)
  122. }
  123. defer gr.Close() // nolint: errcheck
  124. cs.Logger.Info("Catchup by replaying consensus messages", "height", csHeight)
  125. var msg *TimedWALMessage
  126. dec := WALDecoder{gr}
  127. LOOP:
  128. for {
  129. msg, err = dec.Decode()
  130. switch {
  131. case err == io.EOF:
  132. break LOOP
  133. case IsDataCorruptionError(err):
  134. cs.Logger.Error("data has been corrupted in last height of consensus WAL", "err", err, "height", csHeight)
  135. return err
  136. case err != nil:
  137. return err
  138. }
  139. // NOTE: since the priv key is set when the msgs are received
  140. // it will attempt to eg double sign but we can just ignore it
  141. // since the votes will be replayed and we'll get to the next step
  142. if err := cs.readReplayMessage(msg, nil); err != nil {
  143. return err
  144. }
  145. }
  146. cs.Logger.Info("Replay: Done")
  147. return nil
  148. }
  149. //--------------------------------------------------------------------------------
  150. // Parses marker lines of the form:
  151. // #ENDHEIGHT: 12345
  152. /*
  153. func makeHeightSearchFunc(height int64) auto.SearchFunc {
  154. return func(line string) (int, error) {
  155. line = strings.TrimRight(line, "\n")
  156. parts := strings.Split(line, " ")
  157. if len(parts) != 2 {
  158. return -1, errors.New("Line did not have 2 parts")
  159. }
  160. i, err := strconv.Atoi(parts[1])
  161. if err != nil {
  162. return -1, errors.New("Failed to parse INFO: " + err.Error())
  163. }
  164. if height < i {
  165. return 1, nil
  166. } else if height == i {
  167. return 0, nil
  168. } else {
  169. return -1, nil
  170. }
  171. }
  172. }*/
  173. //---------------------------------------------------
  174. // 2. Recover from failure while applying the block.
  175. // (by handshaking with the app to figure out where
  176. // we were last, and using the WAL to recover there.)
  177. //---------------------------------------------------
  178. type Handshaker struct {
  179. stateDB dbm.DB
  180. initialState sm.State
  181. store sm.BlockStore
  182. eventBus types.BlockEventPublisher
  183. genDoc *types.GenesisDoc
  184. logger log.Logger
  185. nBlocks int // number of blocks applied to the state
  186. }
  187. func NewHandshaker(stateDB dbm.DB, state sm.State,
  188. store sm.BlockStore, genDoc *types.GenesisDoc) *Handshaker {
  189. return &Handshaker{
  190. stateDB: stateDB,
  191. initialState: state,
  192. store: store,
  193. eventBus: types.NopEventBus{},
  194. genDoc: genDoc,
  195. logger: log.NewNopLogger(),
  196. nBlocks: 0,
  197. }
  198. }
  199. func (h *Handshaker) SetLogger(l log.Logger) {
  200. h.logger = l
  201. }
  202. // SetEventBus - sets the event bus for publishing block related events.
  203. // If not called, it defaults to types.NopEventBus.
  204. func (h *Handshaker) SetEventBus(eventBus types.BlockEventPublisher) {
  205. h.eventBus = eventBus
  206. }
  207. // NBlocks returns the number of blocks applied to the state.
  208. func (h *Handshaker) NBlocks() int {
  209. return h.nBlocks
  210. }
  211. // TODO: retry the handshake/replay if it fails ?
  212. func (h *Handshaker) Handshake(proxyApp proxy.AppConns) error {
  213. // Handshake is done via ABCI Info on the query conn.
  214. res, err := proxyApp.Query().InfoSync(proxy.RequestInfo)
  215. if err != nil {
  216. return fmt.Errorf("Error calling Info: %v", err)
  217. }
  218. blockHeight := res.LastBlockHeight
  219. if blockHeight < 0 {
  220. return fmt.Errorf("Got a negative last block height (%d) from the app", blockHeight)
  221. }
  222. appHash := res.LastBlockAppHash
  223. h.logger.Info("ABCI Handshake App Info",
  224. "height", blockHeight,
  225. "hash", fmt.Sprintf("%X", appHash),
  226. "software-version", res.Version,
  227. "protocol-version", res.AppVersion,
  228. )
  229. // Set AppVersion on the state.
  230. if h.initialState.Version.Consensus.App != version.Protocol(res.AppVersion) {
  231. h.initialState.Version.Consensus.App = version.Protocol(res.AppVersion)
  232. sm.SaveState(h.stateDB, h.initialState)
  233. }
  234. // Replay blocks up to the latest in the blockstore.
  235. _, err = h.ReplayBlocks(h.initialState, appHash, blockHeight, proxyApp)
  236. if err != nil {
  237. return fmt.Errorf("error on replay: %v", err)
  238. }
  239. h.logger.Info("Completed ABCI Handshake - Tendermint and App are synced",
  240. "appHeight", blockHeight, "appHash", fmt.Sprintf("%X", appHash))
  241. // TODO: (on restart) replay mempool
  242. return nil
  243. }
  244. // ReplayBlocks replays all blocks since appBlockHeight and ensures the result
  245. // matches the current state.
  246. // Returns the final AppHash or an error.
  247. func (h *Handshaker) ReplayBlocks(
  248. state sm.State,
  249. appHash []byte,
  250. appBlockHeight int64,
  251. proxyApp proxy.AppConns,
  252. ) ([]byte, error) {
  253. storeBlockHeight := h.store.Height()
  254. stateBlockHeight := state.LastBlockHeight
  255. h.logger.Info(
  256. "ABCI Replay Blocks",
  257. "appHeight",
  258. appBlockHeight,
  259. "storeHeight",
  260. storeBlockHeight,
  261. "stateHeight",
  262. stateBlockHeight)
  263. // If appBlockHeight == 0 it means that we are at genesis and hence should send InitChain.
  264. if appBlockHeight == 0 {
  265. validators := make([]*types.Validator, len(h.genDoc.Validators))
  266. for i, val := range h.genDoc.Validators {
  267. validators[i] = types.NewValidator(val.PubKey, val.Power)
  268. }
  269. validatorSet := types.NewValidatorSet(validators)
  270. nextVals := types.TM2PB.ValidatorUpdates(validatorSet)
  271. csParams := types.TM2PB.ConsensusParams(h.genDoc.ConsensusParams)
  272. req := abci.RequestInitChain{
  273. Time: h.genDoc.GenesisTime,
  274. ChainId: h.genDoc.ChainID,
  275. ConsensusParams: csParams,
  276. Validators: nextVals,
  277. AppStateBytes: h.genDoc.AppState,
  278. }
  279. res, err := proxyApp.Consensus().InitChainSync(req)
  280. if err != nil {
  281. return nil, err
  282. }
  283. if stateBlockHeight == 0 { //we only update state when we are in initial state
  284. // If the app returned validators or consensus params, update the state.
  285. if len(res.Validators) > 0 {
  286. vals, err := types.PB2TM.ValidatorUpdates(res.Validators)
  287. if err != nil {
  288. return nil, err
  289. }
  290. state.Validators = types.NewValidatorSet(vals)
  291. state.NextValidators = types.NewValidatorSet(vals)
  292. } else if len(h.genDoc.Validators) == 0 {
  293. // If validator set is not set in genesis and still empty after InitChain, exit.
  294. return nil, fmt.Errorf("validator set is nil in genesis and still empty after InitChain")
  295. }
  296. if res.ConsensusParams != nil {
  297. state.ConsensusParams = state.ConsensusParams.Update(res.ConsensusParams)
  298. }
  299. sm.SaveState(h.stateDB, state)
  300. }
  301. }
  302. // First handle edge cases and constraints on the storeBlockHeight.
  303. switch {
  304. case storeBlockHeight == 0:
  305. assertAppHashEqualsOneFromState(appHash, state)
  306. return appHash, nil
  307. case storeBlockHeight < appBlockHeight:
  308. // the app should never be ahead of the store (but this is under app's control)
  309. return appHash, sm.ErrAppBlockHeightTooHigh{CoreHeight: storeBlockHeight, AppHeight: appBlockHeight}
  310. case storeBlockHeight < stateBlockHeight:
  311. // the state should never be ahead of the store (this is under tendermint's control)
  312. panic(fmt.Sprintf("StateBlockHeight (%d) > StoreBlockHeight (%d)", stateBlockHeight, storeBlockHeight))
  313. case storeBlockHeight > stateBlockHeight+1:
  314. // store should be at most one ahead of the state (this is under tendermint's control)
  315. panic(fmt.Sprintf("StoreBlockHeight (%d) > StateBlockHeight + 1 (%d)", storeBlockHeight, stateBlockHeight+1))
  316. }
  317. var err error
  318. // Now either store is equal to state, or one ahead.
  319. // For each, consider all cases of where the app could be, given app <= store
  320. if storeBlockHeight == stateBlockHeight {
  321. // Tendermint ran Commit and saved the state.
  322. // Either the app is asking for replay, or we're all synced up.
  323. if appBlockHeight < storeBlockHeight {
  324. // the app is behind, so replay blocks, but no need to go through WAL (state is already synced to store)
  325. return h.replayBlocks(state, proxyApp, appBlockHeight, storeBlockHeight, false)
  326. } else if appBlockHeight == storeBlockHeight {
  327. // We're good!
  328. assertAppHashEqualsOneFromState(appHash, state)
  329. return appHash, nil
  330. }
  331. } else if storeBlockHeight == stateBlockHeight+1 {
  332. // We saved the block in the store but haven't updated the state,
  333. // so we'll need to replay a block using the WAL.
  334. switch {
  335. case appBlockHeight < stateBlockHeight:
  336. // the app is further behind than it should be, so replay blocks
  337. // but leave the last block to go through the WAL
  338. return h.replayBlocks(state, proxyApp, appBlockHeight, storeBlockHeight, true)
  339. case appBlockHeight == stateBlockHeight:
  340. // We haven't run Commit (both the state and app are one block behind),
  341. // so replayBlock with the real app.
  342. // NOTE: We could instead use the cs.WAL on cs.Start,
  343. // but we'd have to allow the WAL to replay a block that wrote it's #ENDHEIGHT
  344. h.logger.Info("Replay last block using real app")
  345. state, err = h.replayBlock(state, storeBlockHeight, proxyApp.Consensus())
  346. return state.AppHash, err
  347. case appBlockHeight == storeBlockHeight:
  348. // We ran Commit, but didn't save the state, so replayBlock with mock app.
  349. abciResponses, err := sm.LoadABCIResponses(h.stateDB, storeBlockHeight)
  350. if err != nil {
  351. return nil, err
  352. }
  353. mockApp := newMockProxyApp(appHash, abciResponses)
  354. h.logger.Info("Replay last block using mock app")
  355. state, err = h.replayBlock(state, storeBlockHeight, mockApp)
  356. return state.AppHash, err
  357. }
  358. }
  359. panic(fmt.Sprintf("uncovered case! appHeight: %d, storeHeight: %d, stateHeight: %d",
  360. appBlockHeight, storeBlockHeight, stateBlockHeight))
  361. }
  362. func (h *Handshaker) replayBlocks(
  363. state sm.State,
  364. proxyApp proxy.AppConns,
  365. appBlockHeight,
  366. storeBlockHeight int64,
  367. mutateState bool) ([]byte, error) {
  368. // App is further behind than it should be, so we need to replay blocks.
  369. // We replay all blocks from appBlockHeight+1.
  370. //
  371. // Note that we don't have an old version of the state,
  372. // so we by-pass state validation/mutation using sm.ExecCommitBlock.
  373. // This also means we won't be saving validator sets if they change during this period.
  374. // TODO: Load the historical information to fix this and just use state.ApplyBlock
  375. //
  376. // If mutateState == true, the final block is replayed with h.replayBlock()
  377. var appHash []byte
  378. var err error
  379. finalBlock := storeBlockHeight
  380. if mutateState {
  381. finalBlock--
  382. }
  383. for i := appBlockHeight + 1; i <= finalBlock; i++ {
  384. h.logger.Info("Applying block", "height", i)
  385. block := h.store.LoadBlock(i)
  386. // Extra check to ensure the app was not changed in a way it shouldn't have.
  387. if len(appHash) > 0 {
  388. assertAppHashEqualsOneFromBlock(appHash, block)
  389. }
  390. appHash, err = sm.ExecCommitBlock(proxyApp.Consensus(), block, h.logger, h.stateDB)
  391. if err != nil {
  392. return nil, err
  393. }
  394. h.nBlocks++
  395. }
  396. if mutateState {
  397. // sync the final block
  398. state, err = h.replayBlock(state, storeBlockHeight, proxyApp.Consensus())
  399. if err != nil {
  400. return nil, err
  401. }
  402. appHash = state.AppHash
  403. }
  404. assertAppHashEqualsOneFromState(appHash, state)
  405. return appHash, nil
  406. }
  407. // ApplyBlock on the proxyApp with the last block.
  408. func (h *Handshaker) replayBlock(state sm.State, height int64, proxyApp proxy.AppConnConsensus) (sm.State, error) {
  409. block := h.store.LoadBlock(height)
  410. meta := h.store.LoadBlockMeta(height)
  411. blockExec := sm.NewBlockExecutor(h.stateDB, h.logger, proxyApp, mock.Mempool{}, sm.MockEvidencePool{})
  412. blockExec.SetEventBus(h.eventBus)
  413. var err error
  414. state, err = blockExec.ApplyBlock(state, meta.BlockID, block)
  415. if err != nil {
  416. return sm.State{}, err
  417. }
  418. h.nBlocks++
  419. return state, nil
  420. }
  421. func assertAppHashEqualsOneFromBlock(appHash []byte, block *types.Block) {
  422. if !bytes.Equal(appHash, block.AppHash) {
  423. panic(fmt.Sprintf(`block.AppHash does not match AppHash after replay. Got %X, expected %X.
  424. Block: %v
  425. `,
  426. appHash, block.AppHash, block))
  427. }
  428. }
  429. func assertAppHashEqualsOneFromState(appHash []byte, state sm.State) {
  430. if !bytes.Equal(appHash, state.AppHash) {
  431. panic(fmt.Sprintf(`state.AppHash does not match AppHash after replay. Got
  432. %X, expected %X.
  433. State: %v
  434. Did you reset Tendermint without resetting your application's data?`,
  435. appHash, state.AppHash, state))
  436. }
  437. }
  438. //--------------------------------------------------------------------------------
  439. // mockProxyApp uses ABCIResponses to give the right results
  440. // Useful because we don't want to call Commit() twice for the same block on the real app.
  441. func newMockProxyApp(appHash []byte, abciResponses *sm.ABCIResponses) proxy.AppConnConsensus {
  442. clientCreator := proxy.NewLocalClientCreator(&mockProxyApp{
  443. appHash: appHash,
  444. abciResponses: abciResponses,
  445. })
  446. cli, _ := clientCreator.NewABCIClient()
  447. err := cli.Start()
  448. if err != nil {
  449. panic(err)
  450. }
  451. return proxy.NewAppConnConsensus(cli)
  452. }
  453. type mockProxyApp struct {
  454. abci.BaseApplication
  455. appHash []byte
  456. txCount int
  457. abciResponses *sm.ABCIResponses
  458. }
  459. func (mock *mockProxyApp) DeliverTx(req abci.RequestDeliverTx) abci.ResponseDeliverTx {
  460. r := mock.abciResponses.DeliverTxs[mock.txCount]
  461. mock.txCount++
  462. if r == nil { //it could be nil because of amino unMarshall, it will cause an empty ResponseDeliverTx to become nil
  463. return abci.ResponseDeliverTx{}
  464. }
  465. return *r
  466. }
  467. func (mock *mockProxyApp) EndBlock(req abci.RequestEndBlock) abci.ResponseEndBlock {
  468. mock.txCount = 0
  469. return *mock.abciResponses.EndBlock
  470. }
  471. func (mock *mockProxyApp) Commit() abci.ResponseCommit {
  472. return abci.ResponseCommit{Data: mock.appHash}
  473. }