
1235 lines
38 KiB

privval: refactor Remote signers (#3370)

This PR is related to #3107 and a continuation of #3351

It is important to emphasise that in the privval original design, client/server and listening/dialing roles are inverted and do not follow a conventional interaction.

Given two hosts A and B:

- Host A is listener/client
- Host B is dialer/server (contains the secret key)

When A requires a signature, it needs to wait for B to dial in before it can issue a request. A only accepts a single connection and any failure leads to dropping the connection and waiting for B to reconnect.

The original rationale behind this design was based on security.

- Host B only allows outbound connections to a list of whitelisted hosts.
- It is not possible to reach B unless B dials in. There are no listening/open ports in B.

This PR results in the following changes:

- Refactors ping/heartbeat to avoid previously existing race conditions.
- Separates transport (dialer/listener) from signing (client/server) concerns to simplify workflow.
- Unifies and abstracts away the differences between unix and tcp sockets.
- A single signer endpoint implementation unifies connection handling code (read/write/close/connection obj)
- The signer request handler (server side) is customizable to increase testability.
- Updates and extends unit tests

A high level overview of the classes is as follows:

Transport (endpoints): The following classes take care of establishing a connection

- SignerDialerEndpoint
- SignerListeningEndpoint
- SignerEndpoint groups common functionality (read/write/timeouts/etc.)

Signing (client/server): The following classes take care of exchanging request/responses

- SignerClient
- SignerServer

This PR also closes #3601

Commits:

* refactoring - work in progress
* reworking unit tests
* Encapsulating and fixing unit tests
* Improve tests
* Clean up
* Fix/improve unit tests
* clean up tests
* Improving service endpoint
* fixing unit test
* fix linter issues
* avoid invalid cache values (improve later?)
* complete implementation
* wip
* improved connection loop
* Improve reconnections + fixing unit tests
* addressing comments
* small formatting changes
* clean up
* Update node/node.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_client.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_client_test.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* check during initialization
* dropping connecting when writing fails
* removing break
* use t.log instead
* unifying and using cmn.GetFreePort()
* review fixes
* reordering and unifying drop connection
* closing instead of signalling
* refactored service loop
* removed superfluous brackets
* GetPubKey can return errors
* Revert "GetPubKey can return errors"
  This reverts commit 68c06f19b4650389d7e5ab1659b318889028202c.
* adding entry to changelog
* Update CHANGELOG_PENDING.md
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_client.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_listener_endpoint_test.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* updating node.go
* review fixes
* fixes linter
* fixing unit test
* small fixes in comments
* addressing review comments
* addressing review comments 2
* reverting suggestion
* Update privval/signer_client_test.go
  Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update privval/signer_client_test.go
  Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update privval/signer_listener_endpoint_test.go
  Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* do not expose brokenSignerDialerEndpoint
* clean up logging
* unifying methods shorten test time signer also drops
* reenabling pings
* improving testability + unit test
* fixing go fmt + unit test
* remove unused code
* Addressing review comments
* simplifying connection workflow
* fix linter/go import issue
* using base service quit
* updating comment
* Simplifying design + adjusting names
* fixing linter issues
* refactoring test harness + fixes
* Addressing review comments
* cleaning up
* adding additional error check
5 years ago
limit number of /subscribe clients and queries per client (#3269)

* limit number of /subscribe clients and queries per client

Add the following config variables (under [rpc] section):

* max_subscription_clients
* max_subscriptions_per_client
* timeout_broadcast_tx_commit

Fixes #2826

new HTTPClient interface for subscriptions

finalize HTTPClient events interface

remove EventSubscriber

fix data race

```
WARNING: DATA RACE
Read at 0x00c000a36060 by goroutine 129:
  github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe.func1()
      /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:168 +0x1f0

Previous write at 0x00c000a36060 by goroutine 132:
  github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe()
      /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:191 +0x4e0
  github.com/tendermint/tendermint/rpc/client.WaitForOneEvent()
      /go/src/github.com/tendermint/tendermint/rpc/client/helpers.go:64 +0x178
  github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync.func1()
      /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:139 +0x298
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:827 +0x162

Goroutine 129 (running) created at:
  github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe()
      /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:164 +0x4b7
  github.com/tendermint/tendermint/rpc/client.WaitForOneEvent()
      /go/src/github.com/tendermint/tendermint/rpc/client/helpers.go:64 +0x178
  github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync.func1()
      /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:139 +0x298
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:827 +0x162

Goroutine 132 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:878 +0x659
  github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync()
      /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:119 +0x186
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:827 +0x162
==================
```

lite client works (tested manually)

godoc comments

httpclient: do not close the out channel

use TimeoutBroadcastTxCommit

no timeout for unsubscribe but 1s Local (5s HTTP) timeout for resubscribe

format code

change Subscribe#out cap to 1 and replace config vars with RPCConfig

TimeoutBroadcastTxCommit can't be greater than rpcserver.WriteTimeout

rpc: Context as first parameter to all functions

reformat code

fixes after my own review

fixes after Ethan's review

add test stubs

fix config.toml

* fixes after manual testing

- rpc: do not recommend to use BroadcastTxCommit because it's slow and wastes Tendermint resources (pubsub)
- rpc: better error in Subscribe and BroadcastTxCommit
- HTTPClient: do not resubscribe if err = ErrAlreadySubscribed

* fixes after Ismail's review

* Update rpc/grpc/grpc_test.go
  Co-Authored-By: melekes <anton.kalyaev@gmail.com>
5 years ago
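To make the two limits named in the /subscribe commit concrete, the following is a rough sketch of how a cap on clients and a cap on subscriptions per client could be enforced together. It is an illustration only: `subscriptionLimiter`, its methods and the error values are hypothetical and are not taken from the Tendermint RPC code.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Hypothetical errors, named after the config variables they correspond to.
var (
	errTooManyClients       = errors.New("max_subscription_clients reached")
	errTooManySubscriptions = errors.New("max_subscriptions_per_client reached")
)

// subscriptionLimiter (hypothetical) tracks how many subscriptions each
// client currently holds.
type subscriptionLimiter struct {
	mtx          sync.Mutex
	maxClients   int
	maxPerClient int
	perClient    map[string]int
}

func newSubscriptionLimiter(maxClients, maxPerClient int) *subscriptionLimiter {
	return &subscriptionLimiter{
		maxClients:   maxClients,
		maxPerClient: maxPerClient,
		perClient:    make(map[string]int),
	}
}

// Subscribe records one more subscription for clientID, or returns an error
// if either limit would be exceeded.
func (l *subscriptionLimiter) Subscribe(clientID string) error {
	l.mtx.Lock()
	defer l.mtx.Unlock()

	n, known := l.perClient[clientID]
	if !known && len(l.perClient) >= l.maxClients {
		return errTooManyClients
	}
	if n >= l.maxPerClient {
		return errTooManySubscriptions
	}
	l.perClient[clientID] = n + 1
	return nil
}

// Unsubscribe releases one subscription and forgets clients that have none left.
func (l *subscriptionLimiter) Unsubscribe(clientID string) {
	l.mtx.Lock()
	defer l.mtx.Unlock()

	n, known := l.perClient[clientID]
	if !known {
		return
	}
	if n <= 1 {
		delete(l.perClient, clientID)
		return
	}
	l.perClient[clientID] = n - 1
}

func main() {
	l := newSubscriptionLimiter(1, 2)
	fmt.Println(l.Subscribe("a")) // <nil>
	fmt.Println(l.Subscribe("a")) // <nil>
	fmt.Println(l.Subscribe("a")) // max_subscriptions_per_client reached
	fmt.Println(l.Subscribe("b")) // max_subscription_clients reached
}
```

A mutex guards the map because subscriptions may arrive from several connections at once; whether the real server keys clients by remote address or by some other identifier is not stated in the commit message.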
blockchain: Reorg reactor (#3561)

* go routines in blockchain reactor
* Added reference to the go routine diagram
* Initial commit
* cleanup
* Undo testing_logger change, committed by mistake
* Fix the test loggers
* pulled some fsm code into pool.go
* added pool tests
* changes to the design
  added block requests under peer
  moved the request trigger in the reactor poolRoutine, triggered now by a ticker
  in general moved everything required for making block requests smarter in the poolRoutine
  added a simple map of heights to keep track of what will need to be requested next
  added a few more tests
* send errors to FSM in a different channel than blocks
  send errors (RemovePeer) from switch on a different channel than the one receiving blocks
  renamed channels
  added more pool tests
* more pool tests
* lint errors
* more tests
* more tests
* switch fast sync to new implementation
* fixed data race in tests
* cleanup
* finished fsm tests
* address golangci comments :)
* address golangci comments :)
* Added timeout on next block needed to advance
* updating docs and cleanup
* fix issue in test from previous cleanup
* cleanup
* Added termination scenarios, tests and more cleanup
* small fixes to adr, comments and cleanup
* Fix bug in sendRequest()
  If we tried to send a request to a peer not present in the switch, a missing continue statement caused the request to be blackholed in a peer that was removed and never retried.
  While this bug was manifesting, the reactor kept asking for other blocks that would be stored and never consumed. Added the number of unconsumed blocks in the math for requesting blocks ahead of current processing height so eventually there will be no more blocks requested until the already received ones are consumed.
* remove bpPeer's didTimeout field
* Use distinct err codes for peer timeout and FSM timeouts
* Don't allow peers to update with lower height
* review comments from Ethan and Zarko
* some cleanup, renaming, comments
* Move block execution in separate goroutine
* Remove pool's numPending
* review comments
* fix lint, remove old blockchain reactor and duplicates in fsm tests
* small reorg around peer after review comments
* add the reactor spec
* verify block only once
* review comments
* change to int for max number of pending requests
* cleanup and godoc
* Add configuration flag fast sync version
* golangci fixes
* fix config template
* move both reactor versions under blockchain
* cleanup, golint, renaming stuff
* updated documentation, fixed more golint warnings
* integrate with behavior package
* sync with master
* gofmt
* add changelog_pending entry
* move to improvements
* suggestion to changelog entry
5 years ago
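The sendRequest() fix in the blockchain reactor commit hinges on a single `continue`. The sketch below shows the general shape of that kind of bug under assumed names; `assignRequests`, `peerFor` and `connected` are hypothetical and do not come from the reactor code.

```go
package main

import "fmt"

// assignRequests (hypothetical) hands each height to its chosen peer.
// Heights whose peer is no longer connected are returned for retry instead
// of being recorded as in flight.
func assignRequests(heights []int64, peerFor map[int64]string, connected map[string]bool) (map[int64]string, []int64) {
	sent := make(map[int64]string)
	var retry []int64
	for _, h := range heights {
		peer := peerFor[h]
		if !connected[peer] {
			retry = append(retry, h)
			// Without this continue, h would fall through and be recorded
			// against a removed peer, so it would never be requested again.
			continue
		}
		sent[h] = peer
	}
	return sent, retry
}

func main() {
	heights := []int64{1, 2, 3}
	peerFor := map[int64]string{1: "p1", 2: "p2", 3: "p1"}
	connected := map[string]bool{"p1": true} // p2 has been removed

	sent, retry := assignRequests(heights, peerFor, connected)
	fmt.Println(sent)  // map[1:p1 3:p1]
	fmt.Println(retry) // [2]
}
```

The second half of the fix, counting unconsumed blocks when deciding how far ahead to request, is not modelled here.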
node: refactor node.NewNode (#3456)

The node.NewNode method is pretty complex at the moment, and in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced.

This PR aims to eventually make this easier/simpler.

See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here.

## Commits:

* [WIP] Refactor node.NewNode to simplify
  The `node.NewNode` method is pretty complex at the moment, and in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced.
  This PR aims to eventually make this easier/simpler.
* Refactor state loading and genesis doc provider into state package
* Refactor for clarity of return parameters
* Fix incorrect capitalization of error messages
* Simplify extracted functions' names
* Document optionally-prefixed functions
* Refactor optionallyFastSync for clarity of separation of concerns
* Restructure function for early return
* Restructure function for early return
* Remove dependence on deprecated panic functions
* refactor code a bit more plus, expose PEXReactor on node
* align logger names
* add a changelog entry
* align logger names 2
* add a note about PEXReactor returning nil
5 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
5 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
5 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
5 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
5 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
5 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
5 years ago
privval: refactor Remote signers (#3370) This PR is related to #3107 and a continuation of #3351 It is important to emphasise that in the privval original design, client/server and listening/dialing roles are inverted and do not follow a conventional interaction. Given two hosts A and B: Host A is listener/client Host B is dialer/server (contains the secret key) When A requires a signature, it needs to wait for B to dial in before it can issue a request. A only accepts a single connection and any failure leads to dropping the connection and waiting for B to reconnect. The original rationale behind this design was based on security. Host B only allows outbound connections to a list of whitelisted hosts. It is not possible to reach B unless B dials in. There are no listening/open ports in B. This PR results in the following changes: Refactors ping/heartbeat to avoid previously existing race conditions. Separates transport (dialer/listener) from signing (client/server) concerns to simplify workflow. Unifies and abstracts away the differences between unix and tcp sockets. A single signer endpoint implementation unifies connection handling code (read/write/close/connection obj) The signer request handler (server side) is customizable to increase testability. Updates and extends unit tests A high level overview of the classes is as follows: Transport (endpoints): The following classes take care of establishing a connection SignerDialerEndpoint SignerListeningEndpoint SignerEndpoint groups common functionality (read/write/timeouts/etc.) Signing (client/server): The following classes take care of exchanging request/responses SignerClient SignerServer This PR also closes #3601 Commits: * refactoring - work in progress * reworking unit tests * Encapsulating and fixing unit tests * Improve tests * Clean up * Fix/improve unit tests * clean up tests * Improving service endpoint * fixing unit test * fix linter issues * avoid invalid cache values (improve later?) 
* complete implementation * wip * improved connection loop * Improve reconnections + fixing unit tests * addressing comments * small formatting changes * clean up * Update node/node.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_client.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_client_test.go Co-Authored-By: jleni <juan.leni@zondax.ch> * check during initialization * dropping connecting when writing fails * removing break * use t.log instead * unifying and using cmn.GetFreePort() * review fixes * reordering and unifying drop connection * closing instead of signalling * refactored service loop * removed superfluous brackets * GetPubKey can return errors * Revert "GetPubKey can return errors" This reverts commit 68c06f19b4650389d7e5ab1659b318889028202c. * adding entry to changelog * Update CHANGELOG_PENDING.md Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_client.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_listener_endpoint_test.go Co-Authored-By: jleni <juan.leni@zondax.ch> * updating node.go * review fixes * fixes linter * fixing unit test * small fixes in comments * addressing review comments * addressing review comments 2 * reverting suggestion * Update privval/signer_client_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com> * Update privval/signer_client_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com> * Update privval/signer_listener_endpoint_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com> * do not expose brokenSignerDialerEndpoint * clean up logging * unifying methods shorten 
test time signer also drops * reenabling pings * improving testability + unit test * fixing go fmt + unit test * remove unused code * Addressing review comments * simplifying connection workflow * fix linter/go import issue * using base service quit * updating comment * Simplifying design + adjusting names * fixing linter issues * refactoring test harness + fixes * Addressing review comments * cleaning up * adding additional error check
5 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
5 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
5 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
5 years ago
blockchain: Reorg reactor (#3561) * go routines in blockchain reactor * Added reference to the go routine diagram * Initial commit * cleanup * Undo testing_logger change, committed by mistake * Fix the test loggers * pulled some fsm code into pool.go * added pool tests * changes to the design added block requests under peer moved the request trigger in the reactor poolRoutine, triggered now by a ticker in general moved everything required for making block requests smarter in the poolRoutine added a simple map of heights to keep track of what will need to be requested next added a few more tests * send errors to FSM in a different channel than blocks send errors (RemovePeer) from switch on a different channel than the one receiving blocks renamed channels added more pool tests * more pool tests * lint errors * more tests * more tests * switch fast sync to new implementation * fixed data race in tests * cleanup * finished fsm tests * address golangci comments :) * address golangci comments :) * Added timeout on next block needed to advance * updating docs and cleanup * fix issue in test from previous cleanup * cleanup * Added termination scenarios, tests and more cleanup * small fixes to adr, comments and cleanup * Fix bug in sendRequest() If we tried to send a request to a peer not present in the switch, a missing continue statement caused the request to be blackholed in a peer that was removed and never retried. While this bug was manifesting, the reactor kept asking for other blocks that would be stored and never consumed. Added the number of unconsumed blocks in the math for requesting blocks ahead of current processing height so eventually there will be no more blocks requested until the already received ones are consumed. 
* remove bpPeer's didTimeout field * Use distinct err codes for peer timeout and FSM timeouts * Don't allow peers to update with lower height * review comments from Ethan and Zarko * some cleanup, renaming, comments * Move block execution in separate goroutine * Remove pool's numPending * review comments * fix lint, remove old blockchain reactor and duplicates in fsm tests * small reorg around peer after review comments * add the reactor spec * verify block only once * review comments * change to int for max number of pending requests * cleanup and godoc * Add configuration flag fast sync version * golangci fixes * fix config template * move both reactor versions under blockchain * cleanup, golint, renaming stuff * updated documentation, fixed more golint warnings * integrate with behavior package * sync with master * gofmt * add changelog_pending entry * move to improvments * suggestion to changelog entry
5 years ago
p2p: implement new Transport interface (#5791) This implements a new `Transport` interface and related types for the P2P refactor in #5670. Previously, `conn.MConnection` was very tightly coupled to the `Peer` implementation -- in order to allow alternative non-multiplexed transports (e.g. QUIC), MConnection has now been moved below the `Transport` interface, as `MConnTransport`, and decoupled from the peer. Since the `p2p` package is not covered by our Go API stability, this is not considered a breaking change, and not listed in the changelog. The initial approach was to implement the new interface in its final form (which also involved possible protocol changes, see https://github.com/tendermint/spec/pull/227). However, it turned out that this would require a large amount of changes to existing P2P code because of the previous tight coupling between `Peer` and `MConnection` and the reliance on subtleties in the MConnection behavior. Instead, I have broadened the `Transport` interface to expose much of the existing MConnection interface, preserved much of the existing MConnection logic and behavior in the transport implementation, and tried to make as few changes to the rest of the P2P stack as possible. We will instead reduce this interface gradually as we refactor other parts of the P2P stack. The low-level transport code and protocol (e.g. MConnection, SecretConnection and so on) has not been significantly changed, and refactoring this is not a priority until we come up with a plan for QUIC adoption, as we may end up discarding the MConnection code entirely. There are no tests of the new `MConnTransport`, as this code is likely to evolve as we proceed with the P2P refactor, but tests should be added before a final release. The E2E tests are sufficient for basic validation in the meanwhile.
4 years ago
limit number of /subscribe clients and queries per client (#3269) * limit number of /subscribe clients and queries per client Add the following config variables (under [rpc] section): * max_subscription_clients * max_subscriptions_per_client * timeout_broadcast_tx_commit Fixes #2826 new HTTPClient interface for subscriptions finalize HTTPClient events interface remove EventSubscriber fix data race ``` WARNING: DATA RACE Read at 0x00c000a36060 by goroutine 129: github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe.func1() /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:168 +0x1f0 Previous write at 0x00c000a36060 by goroutine 132: github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe() /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:191 +0x4e0 github.com/tendermint/tendermint/rpc/client.WaitForOneEvent() /go/src/github.com/tendermint/tendermint/rpc/client/helpers.go:64 +0x178 github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync.func1() /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:139 +0x298 testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 Goroutine 129 (running) created at: github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe() /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:164 +0x4b7 github.com/tendermint/tendermint/rpc/client.WaitForOneEvent() /go/src/github.com/tendermint/tendermint/rpc/client/helpers.go:64 +0x178 github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync.func1() /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:139 +0x298 testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 Goroutine 132 (running) created at: testing.(*T).Run() /usr/local/go/src/testing/testing.go:878 +0x659 github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync() /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:119 +0x186 
testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 ================== ``` lite client works (tested manually) godoc comments httpclient: do not close the out channel use TimeoutBroadcastTxCommit no timeout for unsubscribe but 1s Local (5s HTTP) timeout for resubscribe format code change Subscribe#out cap to 1 and replace config vars with RPCConfig TimeoutBroadcastTxCommit can't be greater than rpcserver.WriteTimeout rpc: Context as first parameter to all functions reformat code fixes after my own review fixes after Ethan's review add test stubs fix config.toml * fixes after manual testing - rpc: do not recommend to use BroadcastTxCommit because it's slow and wastes Tendermint resources (pubsub) - rpc: better error in Subscribe and BroadcastTxCommit - HTTPClient: do not resubscribe if err = ErrAlreadySubscribed * fixes after Ismail's review * Update rpc/grpc/grpc_test.go Co-Authored-By: melekes <anton.kalyaev@gmail.com>
5 years ago
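The two limits named in the commit above (`max_subscription_clients` and `max_subscriptions_per_client`) can be pictured with a small sketch. This is only an illustration of the idea, not the code from the PR: the type `subscriptionLimiter`, its method, and both error values are hypothetical names invented here, and treating a repeated query as a no-op is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Hypothetical error values, not taken from the Tendermint code base.
var (
	errMaxClients   = errors.New("max_subscription_clients reached")
	errMaxPerClient = errors.New("max_subscriptions_per_client reached")
)

// subscriptionLimiter is a hypothetical helper that tracks which queries
// each client is subscribed to and enforces the two configured limits.
type subscriptionLimiter struct {
	mtx          sync.Mutex
	maxClients   int
	maxPerClient int
	queries      map[string]map[string]struct{} // client ID -> set of queries
}

func newSubscriptionLimiter(maxClients, maxPerClient int) *subscriptionLimiter {
	return &subscriptionLimiter{
		maxClients:   maxClients,
		maxPerClient: maxPerClient,
		queries:      make(map[string]map[string]struct{}),
	}
}

// Subscribe records a subscription or returns the limit that was hit.
func (l *subscriptionLimiter) Subscribe(clientID, query string) error {
	l.mtx.Lock()
	defer l.mtx.Unlock()

	qs, known := l.queries[clientID]
	if !known && len(l.queries) >= l.maxClients {
		return errMaxClients
	}
	if _, dup := qs[query]; dup {
		return nil // assumption: a repeated query is treated as a no-op
	}
	if len(qs) >= l.maxPerClient {
		return errMaxPerClient
	}
	if !known {
		qs = make(map[string]struct{})
		l.queries[clientID] = qs
	}
	qs[query] = struct{}{}
	return nil
}

func main() {
	l := newSubscriptionLimiter(1, 2)
	fmt.Println(l.Subscribe("client-a", "tm.event='NewBlock'"))
	fmt.Println(l.Subscribe("client-a", "tm.event='Tx'"))
	fmt.Println(l.Subscribe("client-a", "tm.event='Vote'")) // per-client limit
	fmt.Println(l.Subscribe("client-b", "tm.event='Tx'"))   // client limit
}
```

Checking both limits before registering a new client keeps a rejected request from leaving an empty entry behind that would count against the client limit.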
mempool: move interface into mempool package (#3524) ## Description Refs #2659 Breaking changes in the mempool package: [mempool] #2659 Mempool now an interface old Mempool renamed to CListMempool NewMempool renamed to NewCListMempool Option renamed to CListOption MempoolReactor renamed to Reactor NewMempoolReactor renamed to NewReactor unexpose TxID method TxInfo.PeerID renamed to SenderID unexpose MempoolReactor.Mempool Breaking changes in the state package: [state] #2659 Mempool interface moved to mempool package MockMempool moved to top-level mock package and renamed to Mempool Non Breaking changes in the node package: [node] #2659 Add Mempool method, which allows you to access mempool ## Commits * move Mempool interface into mempool package Refs #2659 Breaking changes in the mempool package: - Mempool now an interface - old Mempool renamed to CListMempool Breaking changes to state package: - MockMempool moved to mempool/mock package and renamed to Mempool - Mempool interface moved to mempool package * assert CListMempool impl Mempool * gofmt code * rename MempoolReactor to Reactor - combine everything into one interface - rename TxInfo.PeerID to TxInfo.SenderID - unexpose MempoolReactor.Mempool * move mempool mock into top-level mock package * add a fixme TxsFront should not be a part of the Mempool interface because it leaks implementation details. Instead, we need to come up with general interface for querying the mempool so the MempoolReactor can fetch and broadcast txs to peers. * change node#Mempool to return interface * save commit = new reactor arch * Revert "save commit = new reactor arch" This reverts commit 1bfceacd9d65a720574683a7f22771e69af9af4d. 
* require CListMempool in mempool.Reactor * add two changelog entries * fixes after my own review * quote interfaces, structs and functions * fixes after Ismail's review * make node's mempool an interface * make InitWAL/CloseWAL methods a part of Mempool interface * fix merge conflicts * make node's mempool an interface
5 years ago
privval: refactor Remote signers (#3370) This PR is related to #3107 and a continuation of #3351 It is important to emphasise that in the privval original design, client/server and listening/dialing roles are inverted and do not follow a conventional interaction. Given two hosts A and B: Host A is listener/client Host B is dialer/server (contains the secret key) When A requires a signature, it needs to wait for B to dial in before it can issue a request. A only accepts a single connection and any failure leads to dropping the connection and waiting for B to reconnect. The original rationale behind this design was based on security. Host B only allows outbound connections to a list of whitelisted hosts. It is not possible to reach B unless B dials in. There are no listening/open ports in B. This PR results in the following changes: Refactors ping/heartbeat to avoid previously existing race conditions. Separates transport (dialer/listener) from signing (client/server) concerns to simplify workflow. Unifies and abstracts away the differences between unix and tcp sockets. A single signer endpoint implementation unifies connection handling code (read/write/close/connection obj) The signer request handler (server side) is customizable to increase testability. Updates and extends unit tests A high level overview of the classes is as follows: Transport (endpoints): The following classes take care of establishing a connection SignerDialerEndpoint SignerListeningEndpoint SignerEndpoint groups common functionality (read/write/timeouts/etc.) Signing (client/server): The following classes take care of exchanging request/responses SignerClient SignerServer This PR also closes #3601 Commits: * refactoring - work in progress * reworking unit tests * Encapsulating and fixing unit tests * Improve tests * Clean up * Fix/improve unit tests * clean up tests * Improving service endpoint * fixing unit test * fix linter issues * avoid invalid cache values (improve later?) 
* complete implementation * wip * improved connection loop * Improve reconnections + fixing unit tests * addressing comments * small formatting changes * clean up * Update node/node.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_client.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_client_test.go Co-Authored-By: jleni <juan.leni@zondax.ch> * check during initialization * dropping connecting when writing fails * removing break * use t.log instead * unifying and using cmn.GetFreePort() * review fixes * reordering and unifying drop connection * closing instead of signalling * refactored service loop * removed superfluous brackets * GetPubKey can return errors * Revert "GetPubKey can return errors" This reverts commit 68c06f19b4650389d7e5ab1659b318889028202c. * adding entry to changelog * Update CHANGELOG_PENDING.md Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_client.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_listener_endpoint_test.go Co-Authored-By: jleni <juan.leni@zondax.ch> * updating node.go * review fixes * fixes linter * fixing unit test * small fixes in comments * addressing review comments * addressing review comments 2 * reverting suggestion * Update privval/signer_client_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com> * Update privval/signer_client_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com> * Update privval/signer_listener_endpoint_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com> * do not expose brokenSignerDialerEndpoint * clean up logging * unifying methods shorten 
test time signer also drops * reenabling pings * improving testability + unit test * fixing go fmt + unit test * remove unused code * Addressing review comments * simplifying connection workflow * fix linter/go import issue * using base service quit * updating comment * Simplifying design + adjusting names * fixing linter issues * refactoring test harness + fixes * Addressing review comments * cleaning up * adding additional error check
5 years ago
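The inverted roles described in the privval commit above (host A listens but acts as the client, host B dials but acts as the server holding the key) can be sketched with plain TCP. This is a toy illustration under stated assumptions, not the privval implementation: the function names, the newline-delimited wire format, and the fake `signed(...)` reply are all invented here, and messages are assumed to contain no newline.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
)

// runSigner plays host B: it dials out to A (B has no listening port),
// then serves signing requests over the connection it opened.
func runSigner(addr string) error {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return err
	}
	defer conn.Close()

	r := bufio.NewReader(conn)
	for {
		line, err := r.ReadString('\n')
		if err != nil {
			return nil // A closed the connection; nothing more to sign
		}
		msg := strings.TrimSuffix(line, "\n")
		// Stand-in for a real signature; no key material is involved.
		if _, err := fmt.Fprintf(conn, "signed(%s)\n", msg); err != nil {
			return err
		}
	}
}

// requestSignature plays host A: it must wait for B to dial in before
// it can issue a request, and it uses a single accepted connection.
func requestSignature(ln net.Listener, msg string) (string, error) {
	conn, err := ln.Accept()
	if err != nil {
		return "", err
	}
	defer conn.Close()

	if _, err := fmt.Fprintf(conn, "%s\n", msg); err != nil {
		return "", err
	}
	reply, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return "", err
	}
	return strings.TrimSuffix(reply, "\n"), nil
}

// signOnce wires both hosts together on a loopback port.
func signOnce(msg string) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()

	done := make(chan error, 1)
	go func() {
		err := runSigner(ln.Addr().String())
		if err != nil {
			ln.Close() // unblock Accept if B could not dial in
		}
		done <- err
	}()

	sig, err := requestSignature(ln, msg)
	if err != nil {
		return "", err
	}
	if err := <-done; err != nil {
		return "", err
	}
	return sig, nil
}

func main() {
	sig, err := signOnce("vote")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(sig)
}
```

The sketch keeps connection setup (`net.Listen`, `net.Dial`) apart from the request/response exchange, which loosely mirrors the transport versus signing split the commit message describes.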
Close and retry a RemoteSigner on err (#2923) * Close and recreate a RemoteSigner on err * Update changelog * Address Anton's comments / suggestions: - update changelog - restart TCPVal - shut down on `ErrUnexpectedResponse` * re-init remote signer client with fresh connection if Ping fails - add/update TODOs in secret connection - rename tcp.go -> tcp_client.go, same with ipc to clarify their purpose * account for the case where `conn` returned by waitConnection is `nil` - also add TODO about RemoteSigner conn field * Tests for retrying: IPC / TCP - shorter info log on success - set conn and use it in tests to close conn * Tests for retrying: IPC / TCP - shorter info log on success - set conn and use it in tests to close conn - add rwmutex for conn field in IPC * comments and doc.go * fix ipc tests. fixes #2677 * use constants for tests * cleanup some error statements * fixes #2784, race in tests * remove print statement * minor fixes from review * update comment on sts spec * cosmetics * p2p/conn: add failing tests * p2p/conn: make SecretConnection thread safe * changelog * IPCVal signer refactor - use a .reset() method - don't use embedded RemoteSignerClient - guard RemoteSignerClient with mutex - drop the .conn - expose Close() on RemoteSignerClient * apply IPCVal refactor to TCPVal * remove mtx from RemoteSignerClient * consolidate IPCVal and TCPVal, fixes #3104 - done in tcp_client.go - now called SocketVal - takes a listener in the constructor - make tcpListener and unixListener contain all the differences * delete ipc files * introduce unix and tcp dialer for RemoteSigner * rename files - drop tcp_ prefix - rename priv_validator.go to file.go * bring back listener options * fix node * fix priv_val_server * fix node test * minor cleanup and comments
6 years ago
privval: refactor Remote signers (#3370) This PR is related to #3107 and a continuation of #3351 It is important to emphasise that in the privval original design, client/server and listening/dialing roles are inverted and do not follow a conventional interaction. Given two hosts A and B: Host A is listener/client Host B is dialer/server (contains the secret key) When A requires a signature, it needs to wait for B to dial in before it can issue a request. A only accepts a single connection and any failure leads to dropping the connection and waiting for B to reconnect. The original rationale behind this design was based on security. Host B only allows outbound connections to a list of whitelisted hosts. It is not possible to reach B unless B dials in. There are no listening/open ports in B. This PR results in the following changes: Refactors ping/heartbeat to avoid previously existing race conditions. Separates transport (dialer/listener) from signing (client/server) concerns to simplify workflow. Unifies and abstracts away the differences between unix and tcp sockets. A single signer endpoint implementation unifies connection handling code (read/write/close/connection obj) The signer request handler (server side) is customizable to increase testability. Updates and extends unit tests A high level overview of the classes is as follows: Transport (endpoints): The following classes take care of establishing a connection SignerDialerEndpoint SignerListeningEndpoint SignerEndpoint groups common functionality (read/write/timeouts/etc.) Signing (client/server): The following classes take care of exchanging request/responses SignerClient SignerServer This PR also closes #3601 Commits: * refactoring - work in progress * reworking unit tests * Encapsulating and fixing unit tests * Improve tests * Clean up * Fix/improve unit tests * clean up tests * Improving service endpoint * fixing unit test * fix linter issues * avoid invalid cache values (improve later?) 
* complete implementation * wip * improved connection loop * Improve reconnections + fixing unit tests * addressing comments * small formatting changes * clean up * Update node/node.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_client.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_client_test.go Co-Authored-By: jleni <juan.leni@zondax.ch> * check during initialization * dropping connecting when writing fails * removing break * use t.log instead * unifying and using cmn.GetFreePort() * review fixes * reordering and unifying drop connection * closing instead of signalling * refactored service loop * removed superfluous brackets * GetPubKey can return errors * Revert "GetPubKey can return errors" This reverts commit 68c06f19b4650389d7e5ab1659b318889028202c. * adding entry to changelog * Update CHANGELOG_PENDING.md Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_client.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_listener_endpoint_test.go Co-Authored-By: jleni <juan.leni@zondax.ch> * updating node.go * review fixes * fixes linter * fixing unit test * small fixes in comments * addressing review comments * addressing review comments 2 * reverting suggestion * Update privval/signer_client_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com> * Update privval/signer_client_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com> * Update privval/signer_listener_endpoint_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com> * do not expose brokenSignerDialerEndpoint * clean up logging * unifying methods shorten 
test time signer also drops * reenabling pings * improving testability + unit test * fixing go fmt + unit test * remove unused code * Addressing review comments * simplifying connection workflow * fix linter/go import issue * using base service quit * updating comment * Simplifying design + adjusting names * fixing linter issues * refactoring test harness + fixes * Addressing review comments * cleaning up * adding additional error check
5 years ago
package node

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	_ "net/http/pprof" // nolint: gosec // securely exposed on separate, optional port
	"strconv"
	"time"

	_ "github.com/lib/pq" // provide the psql db driver
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
	"github.com/rs/cors"

	abci "github.com/tendermint/tendermint/abci/types"
	cfg "github.com/tendermint/tendermint/config"
	"github.com/tendermint/tendermint/crypto"
	cs "github.com/tendermint/tendermint/internal/consensus"
	"github.com/tendermint/tendermint/internal/mempool"
	"github.com/tendermint/tendermint/internal/p2p"
	"github.com/tendermint/tendermint/internal/p2p/pex"
	"github.com/tendermint/tendermint/internal/statesync"
	"github.com/tendermint/tendermint/libs/log"
	tmnet "github.com/tendermint/tendermint/libs/net"
	tmpubsub "github.com/tendermint/tendermint/libs/pubsub"
	"github.com/tendermint/tendermint/libs/service"
	"github.com/tendermint/tendermint/libs/strings"
	tmtime "github.com/tendermint/tendermint/libs/time"
	"github.com/tendermint/tendermint/light"
	"github.com/tendermint/tendermint/privval"
	tmgrpc "github.com/tendermint/tendermint/privval/grpc"
	"github.com/tendermint/tendermint/proxy"
	rpccore "github.com/tendermint/tendermint/rpc/core"
	grpccore "github.com/tendermint/tendermint/rpc/grpc"
	rpcserver "github.com/tendermint/tendermint/rpc/jsonrpc/server"
	sm "github.com/tendermint/tendermint/state"
	"github.com/tendermint/tendermint/store"
	"github.com/tendermint/tendermint/types"
)

// nodeImpl is the highest level interface to a full Tendermint node.
// It includes all configuration information and running services.
type nodeImpl struct {
	service.BaseService

	// config
	config        *cfg.Config
	genesisDoc    *types.GenesisDoc   // initial validator set
	privValidator types.PrivValidator // local node's validator key

	// network
	transport   *p2p.MConnTransport
	sw          *p2p.Switch // p2p connections
	peerManager *p2p.PeerManager
	router      *p2p.Router
	addrBook    pex.AddrBook // known peers
	nodeInfo    types.NodeInfo
	nodeKey     types.NodeKey // our node privkey
	isListening bool

	// services
	eventBus         *types.EventBus // pub/sub for services
	stateStore       sm.Store
	blockStore       *store.BlockStore // store the blockchain to disk
	bcReactor        service.Service   // for block-syncing
	mempoolReactor   service.Service   // for gossipping transactions
	mempool          mempool.Mempool
	stateSync        bool               // whether the node should state sync on startup
	stateSyncReactor *statesync.Reactor // for hosting and restoring state sync snapshots
	consensusReactor *cs.Reactor        // for participating in the consensus
	pexReactor       service.Service    // for exchanging peer addresses
	evidenceReactor  service.Service
	rpcListeners     []net.Listener // rpc servers
	indexerService   service.Service
	rpcEnv           *rpccore.Environment
	prometheusSrv    *http.Server
}

// newDefaultNode returns a Tendermint node with default settings for the
// PrivValidator, ClientCreator, GenesisDoc, and DBProvider.
// It implements NodeProvider.
func newDefaultNode(config *cfg.Config, logger log.Logger) (service.Service, error) {
	nodeKey, err := types.LoadOrGenNodeKey(config.NodeKeyFile())
	if err != nil {
		return nil, fmt.Errorf("failed to load or gen node key %s: %w", config.NodeKeyFile(), err)
	}
	if config.Mode == cfg.ModeSeed {
		return makeSeedNode(config,
			cfg.DefaultDBProvider,
			nodeKey,
			defaultGenesisDocProviderFunc(config),
			logger,
		)
	}

	var pval *privval.FilePV
	if config.Mode == cfg.ModeValidator {
		pval, err = privval.LoadOrGenFilePV(config.PrivValidator.KeyFile(), config.PrivValidator.StateFile())
		if err != nil {
			return nil, err
		}
	} else {
		pval = nil
	}

	appClient, _ := proxy.DefaultClientCreator(config.ProxyApp, config.ABCI, config.DBDir())
	return makeNode(config,
		pval,
		nodeKey,
		appClient,
		defaultGenesisDocProviderFunc(config),
		cfg.DefaultDBProvider,
		logger,
	)
}

// makeNode returns a new, ready to go, Tendermint Node.
func makeNode(config *cfg.Config,
	privValidator types.PrivValidator,
	nodeKey types.NodeKey,
	clientCreator proxy.ClientCreator,
	genesisDocProvider genesisDocProvider,
	dbProvider cfg.DBProvider,
	logger log.Logger) (service.Service, error) {

	blockStore, stateDB, err := initDBs(config, dbProvider)
	if err != nil {
		return nil, err
	}
	stateStore := sm.NewStore(stateDB)

	genDoc, err := genesisDocProvider()
	if err != nil {
		return nil, err
	}

	err = genDoc.ValidateAndComplete()
	if err != nil {
		return nil, fmt.Errorf("error in genesis doc: %w", err)
	}

	state, err := loadStateFromDBOrGenesisDocProvider(stateStore, genDoc)
	if err != nil {
		return nil, err
	}

	// Create the proxyApp and establish connections to the ABCI app (consensus, mempool, query).
	proxyApp, err := createAndStartProxyAppConns(clientCreator, logger)
	if err != nil {
		return nil, err
	}

	// EventBus and IndexerService must be started before the handshake because
	// we might need to index the txs of the replayed block as this might not have happened
	// when the node stopped last time (i.e. the node stopped after it saved the block
	// but before it indexed the txs, or, endblocker panicked)
	eventBus, err := createAndStartEventBus(logger)
	if err != nil {
		return nil, err
	}

	indexerService, eventSinks, err := createAndStartIndexerService(config, dbProvider, eventBus, logger, genDoc.ChainID)
	if err != nil {
		return nil, err
	}

	// If an address is provided, listen on the socket for a connection from an
	// external signing process.
	if config.PrivValidator.ListenAddr != "" {
		protocol, _ := tmnet.ProtocolAndAddress(config.PrivValidator.ListenAddr)
		// FIXME: we should start services inside OnStart
		switch protocol {
		case "grpc":
			privValidator, err = createAndStartPrivValidatorGRPCClient(config, genDoc.ChainID, logger)
			if err != nil {
				return nil, fmt.Errorf("error with private validator grpc client: %w", err)
			}
		default:
			privValidator, err = createAndStartPrivValidatorSocketClient(config.PrivValidator.ListenAddr, genDoc.ChainID, logger)
			if err != nil {
				return nil, fmt.Errorf("error with private validator socket client: %w", err)
			}
		}
	}

	var pubKey crypto.PubKey
	if config.Mode == cfg.ModeValidator {
		pubKey, err = privValidator.GetPubKey(context.TODO())
		if err != nil {
			return nil, fmt.Errorf("can't get pubkey: %w", err)
		}
		if pubKey == nil {
			return nil, errors.New("could not retrieve public key from private validator")
		}
	}

	// Determine whether we should attempt state sync.
	stateSync := config.StateSync.Enable && !onlyValidatorIsUs(state, pubKey)
	if stateSync && state.LastBlockHeight > 0 {
		logger.Info("Found local state with non-zero height, skipping state sync")
		stateSync = false
	}

	// Create the handshaker, which calls RequestInfo, sets the AppVersion on the state,
	// and replays any blocks as necessary to sync tendermint with the app.
	consensusLogger := logger.With("module", "consensus")
	if !stateSync {
		if err := doHandshake(stateStore, state, blockStore, genDoc, eventBus, proxyApp, consensusLogger); err != nil {
			return nil, err
		}

		// Reload the state. It will have the Version.Consensus.App set by the
		// Handshake, and may have other modifications as well (i.e. depending on
		// what happened during block replay).
		state, err = stateStore.Load()
		if err != nil {
			return nil, fmt.Errorf("cannot load state: %w", err)
		}
	}

	// Determine whether we should do block sync. This must happen after the handshake, since the
	// app may modify the validator set, specifying ourself as the only validator.
	blockSync := config.FastSyncMode && !onlyValidatorIsUs(state, pubKey)

	logNodeStartupInfo(state, pubKey, logger, consensusLogger, config.Mode)

	// TODO: Fetch and provide real options and do proper p2p bootstrapping.
	// TODO: Use a persistent peer database.
	nodeInfo, err := makeNodeInfo(config, nodeKey, eventSinks, genDoc, state)
	if err != nil {
		return nil, err
	}

	p2pLogger := logger.With("module", "p2p")
	transport := createTransport(p2pLogger, config)

	peerManager, err := createPeerManager(config, dbProvider, p2pLogger, nodeKey.ID)
	if err != nil {
		return nil, fmt.Errorf("failed to create peer manager: %w", err)
	}

	csMetrics, p2pMetrics, memplMetrics, smMetrics := defaultMetricsProvider(config.Instrumentation)(genDoc.ChainID)

	router, err := createRouter(p2pLogger, p2pMetrics, nodeInfo, nodeKey.PrivKey,
		peerManager, transport, getRouterConfig(config, proxyApp))
	if err != nil {
		return nil, fmt.Errorf("failed to create router: %w", err)
	}

	mpReactorShim, mpReactor, mp, err := createMempoolReactor(
		config, proxyApp, state, memplMetrics, peerManager, router, logger,
	)
	if err != nil {
		return nil, err
	}

	evReactorShim, evReactor, evPool, err := createEvidenceReactor(
		config, dbProvider, stateDB, blockStore, peerManager, router, logger,
	)
	if err != nil {
		return nil, err
	}

	// make block executor for consensus and blockchain reactors to execute blocks
	blockExec := sm.NewBlockExecutor(
		stateStore,
		logger.With("module", "state"),
		proxyApp.Consensus(),
		mp,
		evPool,
		blockStore,
		sm.BlockExecutorWithMetrics(smMetrics),
	)

	csReactorShim, csReactor, csState := createConsensusReactor(
		config, state, blockExec, blockStore, mp, evPool,
		privValidator, csMetrics, stateSync || blockSync, eventBus,
		peerManager, router, consensusLogger,
	)

	// Create the blockchain reactor. Note, we do not start block sync if we're
	// doing a state sync first.
	bcReactorShim, bcReactor, err := createBlockchainReactor(
		logger, config, state, blockExec, blockStore, csReactor,
		peerManager, router, blockSync && !stateSync, csMetrics,
	)
	if err != nil {
		return nil, fmt.Errorf("could not create blockchain reactor: %w", err)
	}

	// TODO: Remove this once the switch is removed.
	var bcReactorForSwitch p2p.Reactor
	if bcReactorShim != nil {
		bcReactorForSwitch = bcReactorShim
	} else {
		bcReactorForSwitch = bcReactor.(p2p.Reactor)
	}

	// Make ConsensusReactor. Don't enable fully if doing a state sync and/or block sync first.
	// FIXME We need to update metrics here, since other reactors don't have access to them.
	if stateSync {
		csMetrics.StateSyncing.Set(1)
	} else if blockSync {
		csMetrics.BlockSyncing.Set(1)
	}

	// Set up state sync reactor, and schedule a sync if requested.
	// FIXME The way we do phased startups (e.g. replay -> block sync -> consensus) is very messy,
	// we should clean this whole thing up. See:
	// https://github.com/tendermint/tendermint/issues/4644
	var (
		stateSyncReactor     *statesync.Reactor
		stateSyncReactorShim *p2p.ReactorShim

		channels    map[p2p.ChannelID]*p2p.Channel
		peerUpdates *p2p.PeerUpdates
	)

	stateSyncReactorShim = p2p.NewReactorShim(logger.With("module", "statesync"), "StateSyncShim", statesync.ChannelShims)

	if config.P2P.DisableLegacy {
		channels = makeChannelsFromShims(router, statesync.ChannelShims)
		peerUpdates = peerManager.Subscribe()
	} else {
		channels = getChannelsFromShim(stateSyncReactorShim)
		peerUpdates = stateSyncReactorShim.PeerUpdates
	}

	stateSyncReactor = statesync.NewReactor(
		*config.StateSync,
		stateSyncReactorShim.Logger,
		proxyApp.Snapshot(),
		proxyApp.Query(),
		channels[statesync.SnapshotChannel],
		channels[statesync.ChunkChannel],
		channels[statesync.LightBlockChannel],
		peerUpdates,
		stateStore,
		blockStore,
		config.StateSync.TempDir,
	)

	// Add the channel descriptors to the transport.
	// FIXME: This should be removed when the legacy p2p stack is removed and
	// transports can either be agnostic to channel descriptors or can be
	// declared in the constructor.
	transport.AddChannelDescriptors(mpReactorShim.GetChannels())
	transport.AddChannelDescriptors(bcReactorForSwitch.GetChannels())
	transport.AddChannelDescriptors(csReactorShim.GetChannels())
	transport.AddChannelDescriptors(evReactorShim.GetChannels())
	transport.AddChannelDescriptors(stateSyncReactorShim.GetChannels())

	// Optionally, start the pex reactor
	//
	// TODO:
	//
	// We need to set Seeds and PersistentPeers on the switch,
	// since it needs to be able to use these (and their DNS names)
	// even if the PEX is off. We can include the DNS name in the NetAddress,
	// but it would still be nice to have a clear list of the current "PersistentPeers"
	// somewhere that we can return with net_info.
	//
	// If PEX is on, it should handle dialing the seeds. Otherwise the switch does it.
	// Note we currently use the addrBook regardless at least for AddOurAddress

	var (
		pexReactor service.Service
		sw         *p2p.Switch
		addrBook   pex.AddrBook
	)

	pexCh := pex.ChannelDescriptor()
	transport.AddChannelDescriptors([]*p2p.ChannelDescriptor{&pexCh})

	if config.P2P.DisableLegacy {
		addrBook = nil
		pexReactor, err = createPEXReactorV2(config, logger, peerManager, router)
		if err != nil {
			return nil, err
		}
	} else {
		// Set up the Transport and Switch.
		sw = createSwitch(
			config, transport, p2pMetrics, mpReactorShim, bcReactorForSwitch,
			stateSyncReactorShim, csReactorShim, evReactorShim, proxyApp, nodeInfo, nodeKey, p2pLogger,
		)

		err = sw.AddPersistentPeers(strings.SplitAndTrimEmpty(config.P2P.PersistentPeers, ",", " "))
		if err != nil {
			return nil, fmt.Errorf("could not add peers from persistent-peers field: %w", err)
		}

		err = sw.AddUnconditionalPeerIDs(strings.SplitAndTrimEmpty(config.P2P.UnconditionalPeerIDs, ",", " "))
		if err != nil {
			return nil, fmt.Errorf("could not add peer ids from unconditional_peer_ids field: %w", err)
		}

		addrBook, err = createAddrBookAndSetOnSwitch(config, sw, p2pLogger, nodeKey)
		if err != nil {
			return nil, fmt.Errorf("could not create addrbook: %w", err)
		}

		pexReactor = createPEXReactorAndAddToSwitch(addrBook, config, sw, logger)
	}

	if config.RPC.PprofListenAddress != "" {
		go func() {
			logger.Info("Starting pprof server", "laddr", config.RPC.PprofListenAddress)
			logger.Error("pprof server error", "err", http.ListenAndServe(config.RPC.PprofListenAddress, nil))
		}()
	}

	node := &nodeImpl{
		config:        config,
		genesisDoc:    genDoc,
		privValidator: privValidator,

		transport:   transport,
		sw:          sw,
		peerManager: peerManager,
		router:      router,
		addrBook:    addrBook,
		nodeInfo:    nodeInfo,
		nodeKey:     nodeKey,

		stateStore:       stateStore,
		blockStore:       blockStore,
		bcReactor:        bcReactor,
		mempoolReactor:   mpReactor,
		mempool:          mp,
		consensusReactor: csReactor,
		stateSyncReactor: stateSyncReactor,
		stateSync:        stateSync,
		pexReactor:       pexReactor,
		evidenceReactor:  evReactor,
		indexerService:   indexerService,
		eventBus:         eventBus,

		rpcEnv: &rpccore.Environment{
			ProxyAppQuery:   proxyApp.Query(),
			ProxyAppMempool: proxyApp.Mempool(),

			StateStore:       stateStore,
			BlockStore:       blockStore,
			EvidencePool:     evPool,
			ConsensusState:   csState,
			BlockSyncReactor: bcReactor.(cs.BlockSyncReactor),

			P2PPeers:    sw,
			PeerManager: peerManager,

			GenDoc:           genDoc,
			EventSinks:       eventSinks,
			ConsensusReactor: csReactor,
			EventBus:         eventBus,
			Mempool:          mp,
			Logger:           logger.With("module", "rpc"),
			Config:           *config.RPC,
		},
	}

	node.rpcEnv.P2PTransport = node

	node.BaseService = *service.NewBaseService(logger, "Node", node)

	return node, nil
}
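
The sync-mode decisions that makeNode derives (whether to state sync, whether to block sync right away, and whether consensus has to wait) can be read as a small truth table. The following is a minimal sketch of that logic, not part of the original file: `planSync` and `syncPlan` are hypothetical names, and the boolean and height parameters stand in for `config.StateSync.Enable`, `config.FastSyncMode`, `onlyValidatorIsUs(state, pubKey)` and `state.LastBlockHeight`. It also simplifies by evaluating the sole-validator check once, whereas makeNode re-evaluates it after the handshake reloads the state.

```go
package main

import "fmt"

// syncPlan captures the three startup decisions derived in makeNode.
type syncPlan struct {
	StateSync      bool // run state sync first
	BlockSyncNow   bool // start block sync immediately
	ConsensusWaits bool // consensus reactor waits for a sync to finish
}

// planSync is a hypothetical helper mirroring the flag logic in makeNode.
func planSync(stateSyncEnabled, fastSync, soleValidator bool, lastHeight int64) syncPlan {
	// State sync is pointless if we are the only validator, and it is
	// skipped when local state already exists.
	stateSync := stateSyncEnabled && !soleValidator
	if stateSync && lastHeight > 0 {
		stateSync = false
	}
	blockSync := fastSync && !soleValidator
	return syncPlan{
		StateSync:      stateSync,
		BlockSyncNow:   blockSync && !stateSync,
		ConsensusWaits: stateSync || blockSync,
	}
}

func main() {
	fmt.Printf("%+v\n", planSync(true, true, false, 0))  // fresh node, state sync enabled
	fmt.Printf("%+v\n", planSync(true, true, false, 10)) // node with existing local state
	fmt.Printf("%+v\n", planSync(true, true, true, 0))   // single-validator network
}
```

Keeping the three outcomes in one value makes it easier to see that block sync is deferred, not disabled, while a state sync is pending.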

// makeSeedNode returns a new seed node, containing only the p2p stack and the PEX reactor.
func makeSeedNode(config *cfg.Config,
	dbProvider cfg.DBProvider,
	nodeKey types.NodeKey,
	genesisDocProvider genesisDocProvider,
	logger log.Logger,
) (service.Service, error) {

	genDoc, err := genesisDocProvider()
	if err != nil {
		return nil, err
	}

	state, err := sm.MakeGenesisState(genDoc)
	if err != nil {
		return nil, err
	}

	nodeInfo, err := makeSeedNodeInfo(config, nodeKey, genDoc, state)
	if err != nil {
		return nil, err
	}

	// Setup Transport and Switch.
	p2pMetrics := p2p.PrometheusMetrics(config.Instrumentation.Namespace, "chain_id", genDoc.ChainID)
	p2pLogger := logger.With("module", "p2p")
	transport := createTransport(p2pLogger, config)

	peerManager, err := createPeerManager(config, dbProvider, p2pLogger, nodeKey.ID)
	if err != nil {
		return nil, fmt.Errorf("failed to create peer manager: %w", err)
	}

	router, err := createRouter(p2pLogger, p2pMetrics, nodeInfo, nodeKey.PrivKey,
		peerManager, transport, getRouterConfig(config, nil))
	if err != nil {
		return nil, fmt.Errorf("failed to create router: %w", err)
	}

	var (
		pexReactor service.Service
		sw         *p2p.Switch
		addrBook   pex.AddrBook
	)

	// add the pex reactor
	// FIXME: we add channel descriptors to both the router and the transport but only the router
	// should be aware of channel info. We should remove this from transport once the legacy
	// p2p stack is removed.
	pexCh := pex.ChannelDescriptor()
	transport.AddChannelDescriptors([]*p2p.ChannelDescriptor{&pexCh})

	if config.P2P.DisableLegacy {
		pexReactor, err = createPEXReactorV2(config, logger, peerManager, router)
		if err != nil {
			return nil, err
		}
	} else {
		sw = createSwitch(
			config, transport, p2pMetrics, nil, nil,
			nil, nil, nil, nil, nodeInfo, nodeKey, p2pLogger,
		)

		err = sw.AddPersistentPeers(strings.SplitAndTrimEmpty(config.P2P.PersistentPeers, ",", " "))
		if err != nil {
			return nil, fmt.Errorf("could not add peers from persistent-peers field: %w", err)
		}

		err = sw.AddUnconditionalPeerIDs(strings.SplitAndTrimEmpty(config.P2P.UnconditionalPeerIDs, ",", " "))
		if err != nil {
			return nil, fmt.Errorf("could not add peer ids from unconditional_peer_ids field: %w", err)
		}

		addrBook, err = createAddrBookAndSetOnSwitch(config, sw, p2pLogger, nodeKey)
		if err != nil {
			return nil, fmt.Errorf("could not create addrbook: %w", err)
		}

		pexReactor = createPEXReactorAndAddToSwitch(addrBook, config, sw, logger)
	}

	if config.RPC.PprofListenAddress != "" {
		go func() {
			logger.Info("Starting pprof server", "laddr", config.RPC.PprofListenAddress)
			logger.Error("pprof server error", "err", http.ListenAndServe(config.RPC.PprofListenAddress, nil))
		}()
	}

	node := &nodeImpl{
		config:     config,
		genesisDoc: genDoc,

		transport:   transport,
		sw:          sw,
		addrBook:    addrBook,
		nodeInfo:    nodeInfo,
		nodeKey:     nodeKey,
		peerManager: peerManager,
		router:      router,

		pexReactor: pexReactor,
	}
	node.BaseService = *service.NewBaseService(logger, "SeedNode", node)

	return node, nil
}

// OnStart starts the Node. It implements service.Service.
func (n *nodeImpl) OnStart() error {
	now := tmtime.Now()
	genTime := n.genesisDoc.GenesisTime
	if genTime.After(now) {
		n.Logger.Info("Genesis time is in the future. Sleeping until then...", "genTime", genTime)
		time.Sleep(genTime.Sub(now))
	}

	// Start the RPC server before the P2P server
	// so we can e.g. receive txs for the first block
	if n.config.RPC.ListenAddress != "" && n.config.Mode != cfg.ModeSeed {
		listeners, err := n.startRPC()
		if err != nil {
			return err
		}
		n.rpcListeners = listeners
	}

	if n.config.Instrumentation.Prometheus &&
		n.config.Instrumentation.PrometheusListenAddr != "" {
		n.prometheusSrv = n.startPrometheusServer(n.config.Instrumentation.PrometheusListenAddr)
	}

	// Start the transport.
	addr, err := types.NewNetAddressString(n.nodeKey.ID.AddressString(n.config.P2P.ListenAddress))
	if err != nil {
		return err
	}
	if err := n.transport.Listen(p2p.NewEndpoint(addr)); err != nil {
		return err
	}

	n.isListening = true

	n.Logger.Info("p2p service", "legacy_enabled", !n.config.P2P.DisableLegacy)

	if n.config.P2P.DisableLegacy {
		if err = n.router.Start(); err != nil {
			return err
		}
	} else {
		// Add private IDs to addrbook to block those peers being added
		n.addrBook.AddPrivateIDs(strings.SplitAndTrimEmpty(n.config.P2P.PrivatePeerIDs, ",", " "))
		if err = n.sw.Start(); err != nil {
			return err
		}
	}

	if n.config.Mode != cfg.ModeSeed {
		if n.config.BlockSync.Version == cfg.BlockSyncV0 {
			if err := n.bcReactor.Start(); err != nil {
				return err
			}
		}

		// Start the real consensus reactor separately since the switch uses the shim.
		if err := n.consensusReactor.Start(); err != nil {
			return err
		}

		// Start the real state sync reactor separately since the switch uses the shim.
		if err := n.stateSyncReactor.Start(); err != nil {
			return err
		}

		// Start the real mempool reactor separately since the switch uses the shim.
		if err := n.mempoolReactor.Start(); err != nil {
			return err
		}

		// Start the real evidence reactor separately since the switch uses the shim.
		if err := n.evidenceReactor.Start(); err != nil {
			return err
		}
	}

	if n.config.P2P.DisableLegacy {
		if err := n.pexReactor.Start(); err != nil {
			return err
		}
	} else {
		// Always connect to persistent peers
		err = n.sw.DialPeersAsync(strings.SplitAndTrimEmpty(n.config.P2P.PersistentPeers, ",", " "))
		if err != nil {
			return fmt.Errorf("could not dial peers from persistent-peers field: %w", err)
		}
	}

	// Run state sync
	if n.stateSync {
		bcR, ok := n.bcReactor.(cs.BlockSyncReactor)
		if !ok {
			return errors.New("this blockchain reactor does not support switching from state sync")
		}

		// We need the genesis state to get parameters such as the initial height.
		state, err := sm.MakeGenesisState(n.genesisDoc)
		if err != nil {
			return fmt.Errorf("unable to derive state: %w", err)
		}

		ssc := n.config.StateSync
		sp, err := constructStateProvider(ssc, state, n.Logger.With("module", "light"))
		if err != nil {
			return fmt.Errorf("failed to set up light client state provider: %w", err)
		}

		if err := startStateSync(n.stateSyncReactor, bcR, n.consensusReactor, sp,
			ssc, n.config.FastSyncMode, state.InitialHeight, n.eventBus); err != nil {
			return fmt.Errorf("failed to start state sync: %w", err)
		}
	}

	return nil
}

// OnStop stops the Node. It implements service.Service.
func (n *nodeImpl) OnStop() {

	n.Logger.Info("Stopping Node")

	// first stop the non-reactor services
	if err := n.eventBus.Stop(); err != nil {
		n.Logger.Error("Error closing eventBus", "err", err)
	}
	if err := n.indexerService.Stop(); err != nil {
		n.Logger.Error("Error closing indexerService", "err", err)
	}

	if n.config.Mode != cfg.ModeSeed {
		// now stop the reactors
		if n.config.BlockSync.Version == cfg.BlockSyncV0 {
			// Stop the real blockchain reactor separately since the switch uses the shim.
			if err := n.bcReactor.Stop(); err != nil {
				n.Logger.Error("failed to stop the blockchain reactor", "err", err)
			}
		}

		// Stop the real consensus reactor separately since the switch uses the shim.
		if err := n.consensusReactor.Stop(); err != nil {
			n.Logger.Error("failed to stop the consensus reactor", "err", err)
		}

		// Stop the real state sync reactor separately since the switch uses the shim.
		if err := n.stateSyncReactor.Stop(); err != nil {
			n.Logger.Error("failed to stop the state sync reactor", "err", err)
		}

		// Stop the real mempool reactor separately since the switch uses the shim.
		if err := n.mempoolReactor.Stop(); err != nil {
			n.Logger.Error("failed to stop the mempool reactor", "err", err)
		}

		// Stop the real evidence reactor separately since the switch uses the shim.
		if err := n.evidenceReactor.Stop(); err != nil {
			n.Logger.Error("failed to stop the evidence reactor", "err", err)
		}
	}

	if err := n.pexReactor.Stop(); err != nil {
		n.Logger.Error("failed to stop the PEX reactor", "err", err)
	}

	if n.config.P2P.DisableLegacy {
		if err := n.router.Stop(); err != nil {
			n.Logger.Error("failed to stop router", "err", err)
		}
	} else {
		if err := n.sw.Stop(); err != nil {
			n.Logger.Error("failed to stop switch", "err", err)
		}
	}

	if err := n.transport.Close(); err != nil {
		n.Logger.Error("Error closing transport", "err", err)
	}

	n.isListening = false

	// finally stop the listeners / external services
	for _, l := range n.rpcListeners {
		n.Logger.Info("Closing rpc listener", "listener", l)
		if err := l.Close(); err != nil {
			n.Logger.Error("Error closing listener", "listener", l, "err", err)
		}
	}

	if pvsc, ok := n.privValidator.(service.Service); ok {
		if err := pvsc.Stop(); err != nil {
			n.Logger.Error("Error closing private validator", "err", err)
		}
	}

	if n.prometheusSrv != nil {
		if err := n.prometheusSrv.Shutdown(context.Background()); err != nil {
			// Error from closing listeners, or context timeout:
			n.Logger.Error("Prometheus HTTP server Shutdown", "err", err)
		}
	}
}

func (n *nodeImpl) startRPC() ([]net.Listener, error) {
	if n.config.Mode == cfg.ModeValidator {
		pubKey, err := n.privValidator.GetPubKey(context.TODO())
		if err != nil {
			return nil, fmt.Errorf("can't get pubkey: %w", err)
		}
		// Check for a nil key separately so that a nil error is never wrapped with %w.
		if pubKey == nil {
			return nil, fmt.Errorf("can't get pubkey: private validator returned a nil key")
		}
		n.rpcEnv.PubKey = pubKey
	}
	if err := n.rpcEnv.InitGenesisChunks(); err != nil {
		return nil, err
	}

	listenAddrs := strings.SplitAndTrimEmpty(n.config.RPC.ListenAddress, ",", " ")
	routes := n.rpcEnv.GetRoutes()

	if n.config.RPC.Unsafe {
		n.rpcEnv.AddUnsafe(routes)
	}

	config := rpcserver.DefaultConfig()
	config.MaxBodyBytes = n.config.RPC.MaxBodyBytes
	config.MaxHeaderBytes = n.config.RPC.MaxHeaderBytes
	config.MaxOpenConnections = n.config.RPC.MaxOpenConnections
	// If necessary adjust global WriteTimeout to ensure it's greater than
	// TimeoutBroadcastTxCommit.
	// See https://github.com/tendermint/tendermint/issues/3435
	if config.WriteTimeout <= n.config.RPC.TimeoutBroadcastTxCommit {
		config.WriteTimeout = n.config.RPC.TimeoutBroadcastTxCommit + 1*time.Second
	}

	// we may expose the rpc over both a unix and tcp socket
	listeners := make([]net.Listener, len(listenAddrs))
	for i, listenAddr := range listenAddrs {
		mux := http.NewServeMux()
		rpcLogger := n.Logger.With("module", "rpc-server")
		wmLogger := rpcLogger.With("protocol", "websocket")
		wm := rpcserver.NewWebsocketManager(routes,
			rpcserver.OnDisconnect(func(remoteAddr string) {
				err := n.eventBus.UnsubscribeAll(context.Background(), remoteAddr)
				if err != nil && err != tmpubsub.ErrSubscriptionNotFound {
					wmLogger.Error("Failed to unsubscribe addr from events", "addr", remoteAddr, "err", err)
				}
			}),
			rpcserver.ReadLimit(config.MaxBodyBytes),
		)
		wm.SetLogger(wmLogger)
		mux.HandleFunc("/websocket", wm.WebsocketHandler)
		rpcserver.RegisterRPCFuncs(mux, routes, rpcLogger)
		listener, err := rpcserver.Listen(
			listenAddr,
			config,
		)
		if err != nil {
			return nil, err
		}

		var rootHandler http.Handler = mux
		if n.config.RPC.IsCorsEnabled() {
			corsMiddleware := cors.New(cors.Options{
				AllowedOrigins: n.config.RPC.CORSAllowedOrigins,
				AllowedMethods: n.config.RPC.CORSAllowedMethods,
				AllowedHeaders: n.config.RPC.CORSAllowedHeaders,
			})
			rootHandler = corsMiddleware.Handler(mux)
		}
		if n.config.RPC.IsTLSEnabled() {
			go func() {
				if err := rpcserver.ServeTLS(
					listener,
					rootHandler,
					n.config.RPC.CertFile(),
					n.config.RPC.KeyFile(),
					rpcLogger,
					config,
				); err != nil {
					n.Logger.Error("Error serving server with TLS", "err", err)
				}
			}()
		} else {
			go func() {
				if err := rpcserver.Serve(
					listener,
					rootHandler,
					rpcLogger,
					config,
				); err != nil {
					n.Logger.Error("Error serving server", "err", err)
				}
			}()
		}

		listeners[i] = listener
	}

	// we expose a simplified api over grpc for convenience to app devs
	grpcListenAddr := n.config.RPC.GRPCListenAddress
	if grpcListenAddr != "" {
		config := rpcserver.DefaultConfig()
		config.MaxBodyBytes = n.config.RPC.MaxBodyBytes
		config.MaxHeaderBytes = n.config.RPC.MaxHeaderBytes
		// NOTE: GRPCMaxOpenConnections is used, not MaxOpenConnections
		config.MaxOpenConnections = n.config.RPC.GRPCMaxOpenConnections
		// If necessary adjust global WriteTimeout to ensure it's greater than
		// TimeoutBroadcastTxCommit.
		// See https://github.com/tendermint/tendermint/issues/3435
		if config.WriteTimeout <= n.config.RPC.TimeoutBroadcastTxCommit {
			config.WriteTimeout = n.config.RPC.TimeoutBroadcastTxCommit + 1*time.Second
		}
		listener, err := rpcserver.Listen(grpcListenAddr, config)
		if err != nil {
			return nil, err
		}
		go func() {
			if err := grpccore.StartGRPCServer(n.rpcEnv, listener); err != nil {
				n.Logger.Error("Error starting gRPC server", "err", err)
			}
		}()
		listeners = append(listeners, listener)
	}

	return listeners, nil
}
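
// A minimal sketch of the WriteTimeout guard that startRPC applies to both the
// JSON-RPC and gRPC server configs, reduced to the standard library. The helper
// name adjustWriteTimeout and its parameter names are illustrative assumptions;
// they are not part of this package.

```go
package main

import (
	"fmt"
	"time"
)

// adjustWriteTimeout returns a write timeout that is strictly greater than the
// broadcast-commit timeout: if the configured value does not already exceed
// it, the result is the broadcast-commit timeout plus one second.
func adjustWriteTimeout(writeTimeout, broadcastTxCommit time.Duration) time.Duration {
	if writeTimeout <= broadcastTxCommit {
		return broadcastTxCommit + 1*time.Second
	}
	return writeTimeout
}

func main() {
	// Equal timeouts: the write timeout is raised to 11s.
	fmt.Println(adjustWriteTimeout(10*time.Second, 10*time.Second))
	// Already larger: the configured 30s is kept.
	fmt.Println(adjustWriteTimeout(30*time.Second, 10*time.Second))
}
```

// The point of the guard is that a broadcast_tx_commit request may legitimately
// take up to TimeoutBroadcastTxCommit, so the HTTP server must not cut the
// response off first.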

// startPrometheusServer starts a Prometheus HTTP server, listening for metrics
// collectors on addr.
func (n *nodeImpl) startPrometheusServer(addr string) *http.Server {
	srv := &http.Server{
		Addr: addr,
		Handler: promhttp.InstrumentMetricHandler(
			prometheus.DefaultRegisterer, promhttp.HandlerFor(
				prometheus.DefaultGatherer,
				promhttp.HandlerOpts{MaxRequestsInFlight: n.config.Instrumentation.MaxOpenConnections},
			),
		),
	}
	go func() {
		if err := srv.ListenAndServe(); err != http.ErrServerClosed {
			// Error starting or closing listener:
			n.Logger.Error("Prometheus HTTP server ListenAndServe", "err", err)
		}
	}()
	return srv
}

// ConsensusReactor returns the Node's ConsensusReactor.
func (n *nodeImpl) ConsensusReactor() *cs.Reactor {
	return n.consensusReactor
}

// Mempool returns the Node's mempool.
func (n *nodeImpl) Mempool() mempool.Mempool {
	return n.mempool
}

// EventBus returns the Node's EventBus.
func (n *nodeImpl) EventBus() *types.EventBus {
	return n.eventBus
}

// PrivValidator returns the Node's PrivValidator.
// XXX: for convenience only!
func (n *nodeImpl) PrivValidator() types.PrivValidator {
	return n.privValidator
}

// GenesisDoc returns the Node's GenesisDoc.
func (n *nodeImpl) GenesisDoc() *types.GenesisDoc {
	return n.genesisDoc
}

// RPCEnvironment returns the environment holding the objects RPC needs to operate.
func (n *nodeImpl) RPCEnvironment() *rpccore.Environment {
	return n.rpcEnv
}

//------------------------------------------------------------------------------

func (n *nodeImpl) Listeners() []string {
	return []string{
		fmt.Sprintf("Listener(@%v)", n.config.P2P.ExternalAddress),
	}
}

func (n *nodeImpl) IsListening() bool {
	return n.isListening
}

// NodeInfo returns the Node's NodeInfo.
func (n *nodeImpl) NodeInfo() types.NodeInfo {
	return n.nodeInfo
}

// startStateSync starts an asynchronous state sync process, then switches to
// block sync mode, or directly to consensus when blockSync is false.
func startStateSync(
	ssR statesync.SyncReactor,
	bcR cs.BlockSyncReactor,
	conR cs.ConsSyncReactor,
	sp statesync.StateProvider,
	config *cfg.StateSyncConfig,
	blockSync bool,
	stateInitHeight int64,
	eb *types.EventBus,
) error {
	stateSyncLogger := eb.Logger.With("module", "statesync")

	stateSyncLogger.Info("starting state sync...")

	// At the start of state sync, use the initial height as the event height,
	// because the concrete state height is unknown until a snapshot has been fetched.
	d := types.EventDataStateSyncStatus{Complete: false, Height: stateInitHeight}
	if err := eb.PublishEventStateSyncStatus(d); err != nil {
		stateSyncLogger.Error("failed to emit the statesync start event", "err", err)
	}

	go func() {
		state, err := ssR.Sync(context.TODO(), sp, config.DiscoveryTime)
		if err != nil {
			stateSyncLogger.Error("state sync failed", "err", err)
			return
		}

		if err := ssR.Backfill(state); err != nil {
			stateSyncLogger.Error("backfill failed; node has insufficient history to verify all evidence;"+
				" proceeding optimistically...", "err", err)
		}

		conR.SetStateSyncingMetrics(0)

		d := types.EventDataStateSyncStatus{Complete: true, Height: state.LastBlockHeight}
		if err := eb.PublishEventStateSyncStatus(d); err != nil {
			stateSyncLogger.Error("failed to emit the statesync complete event", "err", err)
		}

		if blockSync {
			// FIXME Very ugly to have these metrics bleed through here.
			conR.SetBlockSyncingMetrics(1)
			if err := bcR.SwitchToBlockSync(state); err != nil {
				stateSyncLogger.Error("failed to switch to block sync", "err", err)
				return
			}

			d := types.EventDataBlockSyncStatus{Complete: false, Height: state.LastBlockHeight}
			if err := eb.PublishEventBlockSyncStatus(d); err != nil {
				stateSyncLogger.Error("failed to emit the block sync starting event", "err", err)
			}
		} else {
			conR.SwitchToConsensus(state, true)
		}
	}()
	return nil
}

// genesisDocProvider returns a GenesisDoc.
// It allows the GenesisDoc to be pulled from sources other than the
// filesystem, for instance from a distributed key-value store cluster.
type genesisDocProvider func() (*types.GenesisDoc, error)

// defaultGenesisDocProviderFunc returns a GenesisDocProvider that loads
// the GenesisDoc from the config.GenesisFile() on the filesystem.
func defaultGenesisDocProviderFunc(config *cfg.Config) genesisDocProvider {
	return func() (*types.GenesisDoc, error) {
		return types.GenesisDocFromFile(config.GenesisFile())
	}
}

// metricsProvider returns consensus, p2p, mempool and state Metrics.
type metricsProvider func(chainID string) (*cs.Metrics, *p2p.Metrics, *mempool.Metrics, *sm.Metrics)

// defaultMetricsProvider returns Metrics built using the Prometheus client
// library if Prometheus is enabled. Otherwise, it returns no-op Metrics.
func defaultMetricsProvider(config *cfg.InstrumentationConfig) metricsProvider {
	return func(chainID string) (*cs.Metrics, *p2p.Metrics, *mempool.Metrics, *sm.Metrics) {
		if config.Prometheus {
			return cs.PrometheusMetrics(config.Namespace, "chain_id", chainID),
				p2p.PrometheusMetrics(config.Namespace, "chain_id", chainID),
				mempool.PrometheusMetrics(config.Namespace, "chain_id", chainID),
				sm.PrometheusMetrics(config.Namespace, "chain_id", chainID)
		}
		return cs.NopMetrics(), p2p.NopMetrics(), mempool.NopMetrics(), sm.NopMetrics()
	}
}
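
// The provider pattern above (capture the instrumentation settings now, supply
// the chain ID later) can be sketched with the standard library alone. The
// metrics struct, the provider type and newProvider are hypothetical stand-ins
// invented for this illustration; they do not mirror the real Metrics types.

```go
package main

import "fmt"

// metrics is a hypothetical stand-in for a package-level Metrics struct.
type metrics struct {
	backend string
	labels  []string
}

// provider has the same shape as a metrics provider: the chain ID is passed
// in late, once it is known.
type provider func(chainID string) metrics

// newProvider captures the instrumentation settings in a closure and picks
// the backend when the returned provider is called.
func newProvider(prometheusEnabled bool, namespace string) provider {
	return func(chainID string) metrics {
		if prometheusEnabled {
			return metrics{
				backend: "prometheus/" + namespace,
				labels:  []string{"chain_id", chainID},
			}
		}
		// No-op variant: nothing is labeled or exported.
		return metrics{backend: "nop"}
	}
}

func main() {
	fmt.Println(newProvider(true, "tendermint")("test-chain").backend)
	fmt.Println(newProvider(false, "tendermint")("test-chain").backend)
}
```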

//------------------------------------------------------------------------------

// loadStateFromDBOrGenesisDocProvider attempts to load the state from the
// database, or creates one from the given genesis doc if the database holds
// no state yet.
func loadStateFromDBOrGenesisDocProvider(
	stateStore sm.Store,
	genDoc *types.GenesisDoc,
) (sm.State, error) {

	// 1. Attempt to load state from the database
	state, err := stateStore.Load()
	if err != nil {
		return sm.State{}, err
	}

	if state.IsEmpty() {
		// 2. If it's not there, derive it from the genesis doc
		state, err = sm.MakeGenesisState(genDoc)
		if err != nil {
			return sm.State{}, err
		}
	}

	return state, nil
}
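
// A minimal sketch of the load-or-derive flow above, using an in-memory store.
// The state struct, memStore and loadOrGenesis are hypothetical names made up
// for this example, and the "empty means no chain ID" rule is an assumption of
// the sketch, not a statement about sm.State.

```go
package main

import (
	"errors"
	"fmt"
)

// state is a hypothetical, much reduced stand-in for the node state.
type state struct {
	ChainID string
	Height  int64
}

// isEmpty reports whether nothing has been stored yet (sketch assumption).
func (s state) isEmpty() bool { return s.ChainID == "" }

// memStore is an in-memory store that returns a fixed state or error.
type memStore struct {
	saved state
	err   error
}

func (m memStore) load() (state, error) { return m.saved, m.err }

// loadOrGenesis returns the stored state if there is one, and otherwise
// derives a fresh state at height 0 from the genesis chain ID.
func loadOrGenesis(st memStore, genesisChainID string) (state, error) {
	s, err := st.load()
	if err != nil {
		return state{}, err
	}
	if s.isEmpty() {
		if genesisChainID == "" {
			return state{}, errors.New("genesis doc has no chain ID")
		}
		s = state{ChainID: genesisChainID, Height: 0}
	}
	return s, nil
}

func main() {
	fresh, _ := loadOrGenesis(memStore{}, "test-chain")
	fmt.Println(fresh.ChainID, fresh.Height)

	stored := memStore{saved: state{ChainID: "test-chain", Height: 42}}
	resumed, _ := loadOrGenesis(stored, "test-chain")
	fmt.Println(resumed.ChainID, resumed.Height)
}
```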

func createAndStartPrivValidatorSocketClient(
	listenAddr,
	chainID string,
	logger log.Logger,
) (types.PrivValidator, error) {
	pve, err := privval.NewSignerListener(listenAddr, logger)
	if err != nil {
		return nil, fmt.Errorf("failed to start private validator: %w", err)
	}

	pvsc, err := privval.NewSignerClient(pve, chainID)
	if err != nil {
		return nil, fmt.Errorf("failed to start private validator: %w", err)
	}

	// try to get a pubkey from the private validator for the first time
	_, err = pvsc.GetPubKey(context.TODO())
	if err != nil {
		return nil, fmt.Errorf("can't get pubkey: %w", err)
	}

	const (
		retries = 50 // 50 * 100ms = 5s total
		timeout = 100 * time.Millisecond
	)
	pvscWithRetries := privval.NewRetrySignerClient(pvsc, retries, timeout)

	return pvscWithRetries, nil
}
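
// The retry budget above (50 attempts, 100ms apart) follows a common
// bounded-retry pattern, sketched here with the standard library. This is an
// assumption about the general technique only; how the retry signer client
// works internally is not shown in this file. The names retry and errNotReady
// are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a hypothetical error standing in for a signer that has not
// dialed in yet.
var errNotReady = errors.New("signer not ready")

// retry calls op up to attempts times, sleeping wait between failed tries.
// It returns nil on the first success, or the last error once the budget is
// exhausted.
func retry(attempts int, wait time.Duration, op func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		// Do not sleep after the final failed attempt.
		if i < attempts-1 {
			time.Sleep(wait)
		}
	}
	return err
}

func main() {
	calls := 0
	err := retry(50, time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errNotReady
		}
		return nil
	})
	fmt.Println(calls, err) // prints "3 <nil>"
}
```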

func createAndStartPrivValidatorGRPCClient(
	config *cfg.Config,
	chainID string,
	logger log.Logger,
) (types.PrivValidator, error) {
	pvsc, err := tmgrpc.DialRemoteSigner(
		config.PrivValidator,
		chainID,
		logger,
		config.Instrumentation.Prometheus,
	)
	if err != nil {
		return nil, fmt.Errorf("failed to start private validator: %w", err)
	}

	// try to get a pubkey from the private validator for the first time
	_, err = pvsc.GetPubKey(context.TODO())
	if err != nil {
		return nil, fmt.Errorf("can't get pubkey: %w", err)
	}

	return pvsc, nil
}

func getRouterConfig(conf *cfg.Config, proxyApp proxy.AppConns) p2p.RouterOptions {
	opts := p2p.RouterOptions{
		QueueType: conf.P2P.QueueType,
	}

	if conf.P2P.MaxNumInboundPeers > 0 {
		opts.MaxIncomingConnectionAttempts = conf.P2P.MaxIncomingConnectionAttempts
	}

	if conf.FilterPeers && proxyApp != nil {
		opts.FilterPeerByID = func(ctx context.Context, id types.NodeID) error {
			// Use the caller's context, as FilterPeerByIP does below.
			res, err := proxyApp.Query().QuerySync(ctx, abci.RequestQuery{
				Path: fmt.Sprintf("/p2p/filter/id/%s", id),
			})
			if err != nil {
				return err
			}
			if res.IsErr() {
				return fmt.Errorf("error querying abci app: %v", res)
			}

			return nil
		}

		opts.FilterPeerByIP = func(ctx context.Context, ip net.IP, port uint16) error {
			res, err := proxyApp.Query().QuerySync(ctx, abci.RequestQuery{
				Path: fmt.Sprintf("/p2p/filter/addr/%s", net.JoinHostPort(ip.String(), strconv.Itoa(int(port)))),
			})
			if err != nil {
				return err
			}
			if res.IsErr() {
				return fmt.Errorf("error querying abci app: %v", res)
			}

			return nil
		}
	}

	return opts
}
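
// A small standalone sketch of how the two peer-filter query paths above are
// formed. The path prefixes and the net.JoinHostPort call are taken from
// getRouterConfig; the helper names idFilterPath and addrFilterPath, and the
// sample node ID, are hypothetical.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// idFilterPath builds the query path used to filter a peer by node ID.
func idFilterPath(id string) string {
	return fmt.Sprintf("/p2p/filter/id/%s", id)
}

// addrFilterPath builds the query path used to filter a peer by address.
// net.JoinHostPort wraps IPv6 hosts in brackets so the port stays unambiguous.
func addrFilterPath(ip net.IP, port uint16) string {
	return fmt.Sprintf("/p2p/filter/addr/%s", net.JoinHostPort(ip.String(), strconv.Itoa(int(port))))
}

func main() {
	fmt.Println(idFilterPath("abcd1234"))
	fmt.Println(addrFilterPath(net.ParseIP("10.0.0.7"), 26656))
	fmt.Println(addrFilterPath(net.ParseIP("::1"), 26656))
}
```

// An application that implements these queries therefore has to accept both
// "10.0.0.7:26656" and bracketed forms such as "[::1]:26656" in the addr path.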

// FIXME: Temporary helper function, shims should be removed.
func makeChannelsFromShims(
	router *p2p.Router,
	chShims map[p2p.ChannelID]*p2p.ChannelDescriptorShim,
) map[p2p.ChannelID]*p2p.Channel {

	channels := map[p2p.ChannelID]*p2p.Channel{}
	for chID, chShim := range chShims {
		ch, err := router.OpenChannel(*chShim.Descriptor, chShim.MsgType, chShim.Descriptor.RecvBufferCapacity)
		if err != nil {
			panic(fmt.Sprintf("failed to open channel %v: %v", chID, err))
		}

		channels[chID] = ch
	}

	return channels
}

func getChannelsFromShim(reactorShim *p2p.ReactorShim) map[p2p.ChannelID]*p2p.Channel {
	channels := map[p2p.ChannelID]*p2p.Channel{}
	for chID := range reactorShim.Channels {
		channels[chID] = reactorShim.GetChannel(chID)
	}

	return channels
}

func constructStateProvider(
	ssc *cfg.StateSyncConfig,
	state sm.State,
	logger log.Logger,
) (statesync.StateProvider, error) {
	ctx, cancel := context.WithTimeout(context.TODO(), 10*time.Second)
	defer cancel()

	to := light.TrustOptions{
		Period: ssc.TrustPeriod,
		Height: ssc.TrustHeight,
		Hash:   ssc.TrustHashBytes(),
	}

	return statesync.NewLightClientStateProvider(
		ctx,
		state.ChainID, state.Version, state.InitialHeight,
		ssc.RPCServers, to, logger,
	)
}