privval: refactor Remote signers (#3370)

This PR is related to #3107 and a continuation of #3351

It is important to emphasise that in the privval original design, client/server and listening/dialing roles are inverted and do not follow a conventional interaction.

Given two hosts A and B:

* Host A is listener/client
* Host B is dialer/server (contains the secret key)

When A requires a signature, it needs to wait for B to dial in before it can issue a request. A only accepts a single connection and any failure leads to dropping the connection and waiting for B to reconnect.

The original rationale behind this design was based on security.

* Host B only allows outbound connections to a list of whitelisted hosts.
* It is not possible to reach B unless B dials in. There are no listening/open ports in B.

This PR results in the following changes:

* Refactors ping/heartbeat to avoid previously existing race conditions.
* Separates transport (dialer/listener) from signing (client/server) concerns to simplify workflow.
* Unifies and abstracts away the differences between unix and tcp sockets.
* A single signer endpoint implementation unifies connection handling code (read/write/close/connection obj)
* The signer request handler (server side) is customizable to increase testability.
* Updates and extends unit tests

A high level overview of the classes is as follows:

Transport (endpoints): The following classes take care of establishing a connection

* SignerDialerEndpoint
* SignerListeningEndpoint
* SignerEndpoint groups common functionality (read/write/timeouts/etc.)

Signing (client/server): The following classes take care of exchanging request/responses

* SignerClient
* SignerServer

This PR also closes #3601

Commits:

* refactoring - work in progress
* reworking unit tests
* Encapsulating and fixing unit tests
* Improve tests
* Clean up
* Fix/improve unit tests
* clean up tests
* Improving service endpoint
* fixing unit test
* fix linter issues
* avoid invalid cache values (improve later?)
* complete implementation
* wip
* improved connection loop
* Improve reconnections + fixing unit tests
* addressing comments
* small formatting changes
* clean up
* Update node/node.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_client.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_client_test.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* check during initialization
* dropping connecting when writing fails
* removing break
* use t.log instead
* unifying and using cmn.GetFreePort()
* review fixes
* reordering and unifying drop connection
* closing instead of signalling
* refactored service loop
* removed superfluous brackets
* GetPubKey can return errors
* Revert "GetPubKey can return errors"
  This reverts commit 68c06f19b4650389d7e5ab1659b318889028202c.
* adding entry to changelog
* Update CHANGELOG_PENDING.md
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_client.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_listener_endpoint_test.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* updating node.go
* review fixes
* fixes linter
* fixing unit test
* small fixes in comments
* addressing review comments
* addressing review comments 2
* reverting suggestion
* Update privval/signer_client_test.go
  Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update privval/signer_client_test.go
  Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update privval/signer_listener_endpoint_test.go
  Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* do not expose brokenSignerDialerEndpoint
* clean up logging
* unifying methods shorten test time signer also drops
* reenabling pings
* improving testability + unit test
* fixing go fmt + unit test
* remove unused code
* Addressing review comments
* simplifying connection workflow
* fix linter/go import issue
* using base service quit
* updating comment
* Simplifying design + adjusting names
* fixing linter issues
* refactoring test harness + fixes
* Addressing review comments
* cleaning up
* adding additional error check
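The inverted roles described in this commit message (host A listens but acts as the client, host B dials but acts as the server holding the key) can be illustrated with a small sketch. This is not the privval implementation: the function name `signOverInvertedRoles`, the line-based wire format, and the `signed(...)` placeholder are all assumptions made for illustration, and only the Go standard library is used.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
)

// signOverInvertedRoles performs one request/response exchange in which the
// listening side sends the request (client role) and the dialing side answers
// it (server role). Hypothetical helper, not part of privval.
func signOverInvertedRoles(msg string) (string, error) {
	// Host A: listens for an inbound connection, but will act as the client.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()

	// Host B: dials out, but will act as the server that holds the key.
	errCh := make(chan error, 1)
	go func() {
		conn, err := net.Dial("tcp", ln.Addr().String())
		if err != nil {
			ln.Close() // unblock Accept on host A
			errCh <- err
			return
		}
		defer conn.Close()
		req, err := bufio.NewReader(conn).ReadString('\n')
		if err != nil {
			errCh <- err
			return
		}
		// Placeholder "signature"; a real signer would use the secret key.
		_, err = fmt.Fprintf(conn, "signed(%s)\n", strings.TrimSpace(req))
		errCh <- err
	}()

	// A must wait for B to dial in before it can issue a request.
	conn, err := ln.Accept()
	if err != nil {
		return "", err
	}
	defer conn.Close()

	if _, err := fmt.Fprintf(conn, "%s\n", msg); err != nil {
		return "", err
	}
	resp, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return "", err
	}
	if err := <-errCh; err != nil {
		return "", err
	}
	return strings.TrimSpace(resp), nil
}

func main() {
	sig, err := signOverInvertedRoles("vote")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(sig)
}
```

The point of the sketch is only the ordering: the request cannot be written until `Accept` returns, which mirrors A waiting for B to dial in.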
5 years ago
limit number of /subscribe clients and queries per client (#3269)

* limit number of /subscribe clients and queries per client

  Add the following config variables (under [rpc] section):

  * max_subscription_clients
  * max_subscriptions_per_client
  * timeout_broadcast_tx_commit

  Fixes #2826

  new HTTPClient interface for subscriptions

  finalize HTTPClient events interface

  remove EventSubscriber

  fix data race

  ```
  WARNING: DATA RACE
  Read at 0x00c000a36060 by goroutine 129:
    github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe.func1()
        /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:168 +0x1f0

  Previous write at 0x00c000a36060 by goroutine 132:
    github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe()
        /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:191 +0x4e0
    github.com/tendermint/tendermint/rpc/client.WaitForOneEvent()
        /go/src/github.com/tendermint/tendermint/rpc/client/helpers.go:64 +0x178
    github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync.func1()
        /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:139 +0x298
    testing.tRunner()
        /usr/local/go/src/testing/testing.go:827 +0x162

  Goroutine 129 (running) created at:
    github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe()
        /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:164 +0x4b7
    github.com/tendermint/tendermint/rpc/client.WaitForOneEvent()
        /go/src/github.com/tendermint/tendermint/rpc/client/helpers.go:64 +0x178
    github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync.func1()
        /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:139 +0x298
    testing.tRunner()
        /usr/local/go/src/testing/testing.go:827 +0x162

  Goroutine 132 (running) created at:
    testing.(*T).Run()
        /usr/local/go/src/testing/testing.go:878 +0x659
    github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync()
        /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:119 +0x186
    testing.tRunner()
        /usr/local/go/src/testing/testing.go:827 +0x162
  ==================
  ```

  lite client works (tested manually)

  godoc comments

  httpclient: do not close the out channel

  use TimeoutBroadcastTxCommit

  no timeout for unsubscribe but 1s Local (5s HTTP) timeout for resubscribe

  format code

  change Subscribe#out cap to 1 and replace config vars with RPCConfig

  TimeoutBroadcastTxCommit can't be greater than rpcserver.WriteTimeout

  rpc: Context as first parameter to all functions

  reformat code

  fixes after my own review

  fixes after Ethan's review

  add test stubs

  fix config.toml

* fixes after manual testing

  - rpc: do not recommend to use BroadcastTxCommit because it's slow and wastes Tendermint resources (pubsub)
  - rpc: better error in Subscribe and BroadcastTxCommit
  - HTTPClient: do not resubscribe if err = ErrAlreadySubscribed

* fixes after Ismail's review
* Update rpc/grpc/grpc_test.go
  Co-Authored-By: melekes <anton.kalyaev@gmail.com>
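The two limits named in this commit (`max_subscription_clients` and `max_subscriptions_per_client`) amount to a cap on distinct clients plus a cap on queries per client. A minimal sketch of that bookkeeping follows. The type `subscriptionLimiter`, its methods, and the error values are hypothetical names chosen for illustration; the actual RPC server code may enforce the limits differently.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Hypothetical error values for the three rejection cases.
var (
	errTooManyClients    = errors.New("max_subscription_clients reached")
	errTooManySubs       = errors.New("max_subscriptions_per_client reached")
	errAlreadySubscribed = errors.New("already subscribed")
)

// subscriptionLimiter tracks which queries each client is subscribed to and
// rejects subscriptions that would exceed either limit.
type subscriptionLimiter struct {
	mu           sync.Mutex
	maxClients   int
	maxPerClient int
	subs         map[string]map[string]struct{} // client -> set of queries
}

func newSubscriptionLimiter(maxClients, maxPerClient int) *subscriptionLimiter {
	return &subscriptionLimiter{
		maxClients:   maxClients,
		maxPerClient: maxPerClient,
		subs:         make(map[string]map[string]struct{}),
	}
}

// Subscribe records the (client, query) pair if both limits allow it.
func (l *subscriptionLimiter) Subscribe(client, query string) error {
	l.mu.Lock()
	defer l.mu.Unlock()

	queries, known := l.subs[client]
	if !known && len(l.subs) >= l.maxClients {
		return errTooManyClients
	}
	if _, dup := queries[query]; dup {
		return errAlreadySubscribed
	}
	if len(queries) >= l.maxPerClient {
		return errTooManySubs
	}
	if !known {
		queries = make(map[string]struct{})
		l.subs[client] = queries
	}
	queries[query] = struct{}{}
	return nil
}

func main() {
	l := newSubscriptionLimiter(1, 2)
	fmt.Println(l.Subscribe("a", "q1"))
	fmt.Println(l.Subscribe("a", "q2"))
	fmt.Println(l.Subscribe("a", "q3"))
	fmt.Println(l.Subscribe("b", "q1"))
}
```

Checking both limits before registering a new client avoids leaving an empty entry behind when the first subscription is rejected.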
5 years ago
blockchain: Reorg reactor (#3561)

* go routines in blockchain reactor
* Added reference to the go routine diagram
* Initial commit
* cleanup
* Undo testing_logger change, committed by mistake
* Fix the test loggers
* pulled some fsm code into pool.go
* added pool tests
* changes to the design

  added block requests under peer

  moved the request trigger in the reactor poolRoutine, triggered now by a ticker

  in general moved everything required for making block requests smarter in the poolRoutine

  added a simple map of heights to keep track of what will need to be requested next

  added a few more tests

* send errors to FSM in a different channel than blocks

  send errors (RemovePeer) from switch on a different channel than the one receiving blocks

  renamed channels

  added more pool tests

* more pool tests
* lint errors
* more tests
* more tests
* switch fast sync to new implementation
* fixed data race in tests
* cleanup
* finished fsm tests
* address golangci comments :)
* address golangci comments :)
* Added timeout on next block needed to advance
* updating docs and cleanup
* fix issue in test from previous cleanup
* cleanup
* Added termination scenarios, tests and more cleanup
* small fixes to adr, comments and cleanup
* Fix bug in sendRequest()

  If we tried to send a request to a peer not present in the switch, a missing continue statement caused the request to be blackholed in a peer that was removed and never retried.

  While this bug was manifesting, the reactor kept asking for other blocks that would be stored and never consumed. Added the number of unconsumed blocks in the math for requesting blocks ahead of current processing height so eventually there will be no more blocks requested until the already received ones are consumed.

* remove bpPeer's didTimeout field
* Use distinct err codes for peer timeout and FSM timeouts
* Don't allow peers to update with lower height
* review comments from Ethan and Zarko
* some cleanup, renaming, comments
* Move block execution in separate goroutine
* Remove pool's numPending
* review comments
* fix lint, remove old blockchain reactor and duplicates in fsm tests
* small reorg around peer after review comments
* add the reactor spec
* verify block only once
* review comments
* change to int for max number of pending requests
* cleanup and godoc
* Add configuration flag fast sync version
* golangci fixes
* fix config template
* move both reactor versions under blockchain
* cleanup, golint, renaming stuff
* updated documentation, fixed more golint warnings
* integrate with behavior package
* sync with master
* gofmt
* add changelog_pending entry
* move to improvements
* suggestion to changelog entry
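The `sendRequest()` fix described in this commit message (a missing `continue` let a request be recorded against a peer that was no longer in the switch) can be illustrated loosely as follows. This is not the reactor's code: `assignRequests`, the round-robin peer choice, and the `connected` map standing in for "present in the switch" are all assumptions made for the sketch.

```go
package main

import "fmt"

// assignRequests maps each wanted height to a peer that is still connected.
// Heights whose chosen peer is gone are returned in retry instead of being
// recorded against that peer. Hypothetical helper for illustration only.
func assignRequests(heights []int64, peers []string, connected map[string]bool) (map[int64]string, []int64) {
	assigned := make(map[int64]string)
	var retry []int64
	for i, h := range heights {
		if len(peers) == 0 {
			retry = append(retry, h)
			continue
		}
		peer := peers[i%len(peers)]
		if !connected[peer] {
			// Without this continue, the request would be recorded against
			// a removed peer and never retried.
			retry = append(retry, h)
			continue
		}
		assigned[h] = peer
	}
	return assigned, retry
}

func main() {
	assigned, retry := assignRequests(
		[]int64{1, 2, 3},
		[]string{"a", "b"},
		map[string]bool{"a": true},
	)
	fmt.Println(assigned, retry)
}
```

In this run, heights 1 and 3 are assigned to peer `a`, while height 2, whose chosen peer `b` is not connected, ends up in the retry list.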
5 years ago
node: refactor node.NewNode (#3456)

The node.NewNode method is pretty complex at the moment, and in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced.

This PR aims to eventually make this easier/simpler.

See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here.

## Commits:

* [WIP] Refactor node.NewNode to simplify

  The `node.NewNode` method is pretty complex at the moment, and in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced.

  This PR aims to eventually make this easier/simpler.

* Refactor state loading and genesis doc provider into state package
* Refactor for clarity of return parameters
* Fix incorrect capitalization of error messages
* Simplify extracted functions' names
* Document optionally-prefixed functions
* Refactor optionallyFastSync for clarity of separation of concerns
* Restructure function for early return
* Restructure function for early return
* Remove dependence on deprecated panic functions
* refactor code a bit more plus, expose PEXReactor on node
* align logger names
* add a changelog entry
* align logger names 2
* add a note about PEXReactor returning nil
5 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
5 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
5 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
5 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
5 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
5 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
5 years ago
privval: refactor Remote signers (#3370) This PR is related to #3107 and a continuation of #3351 It is important to emphasise that in the privval original design, client/server and listening/dialing roles are inverted and do not follow a conventional interaction. Given two hosts A and B: Host A is listener/client Host B is dialer/server (contains the secret key) When A requires a signature, it needs to wait for B to dial in before it can issue a request. A only accepts a single connection and any failure leads to dropping the connection and waiting for B to reconnect. The original rationale behind this design was based on security. Host B only allows outbound connections to a list of whitelisted hosts. It is not possible to reach B unless B dials in. There are no listening/open ports in B. This PR results in the following changes: Refactors ping/heartbeat to avoid previously existing race conditions. Separates transport (dialer/listener) from signing (client/server) concerns to simplify workflow. Unifies and abstracts away the differences between unix and tcp sockets. A single signer endpoint implementation unifies connection handling code (read/write/close/connection obj) The signer request handler (server side) is customizable to increase testability. Updates and extends unit tests A high level overview of the classes is as follows: Transport (endpoints): The following classes take care of establishing a connection SignerDialerEndpoint SignerListeningEndpoint SignerEndpoint groups common functionality (read/write/timeouts/etc.) Signing (client/server): The following classes take care of exchanging request/responses SignerClient SignerServer This PR also closes #3601 Commits: * refactoring - work in progress * reworking unit tests * Encapsulating and fixing unit tests * Improve tests * Clean up * Fix/improve unit tests * clean up tests * Improving service endpoint * fixing unit test * fix linter issues * avoid invalid cache values (improve later?) 
* complete implementation * wip * improved connection loop * Improve reconnections + fixing unit tests * addressing comments * small formatting changes * clean up * Update node/node.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_client.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_client_test.go Co-Authored-By: jleni <juan.leni@zondax.ch> * check during initialization * dropping connecting when writing fails * removing break * use t.log instead * unifying and using cmn.GetFreePort() * review fixes * reordering and unifying drop connection * closing instead of signalling * refactored service loop * removed superfluous brackets * GetPubKey can return errors * Revert "GetPubKey can return errors" This reverts commit 68c06f19b4650389d7e5ab1659b318889028202c. * adding entry to changelog * Update CHANGELOG_PENDING.md Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_client.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_listener_endpoint_test.go Co-Authored-By: jleni <juan.leni@zondax.ch> * updating node.go * review fixes * fixes linter * fixing unit test * small fixes in comments * addressing review comments * addressing review comments 2 * reverting suggestion * Update privval/signer_client_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com> * Update privval/signer_client_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com> * Update privval/signer_listener_endpoint_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com> * do not expose brokenSignerDialerEndpoint * clean up logging * unifying methods shorten 
test time signer also drops * reenabling pings * improving testability + unit test * fixing go fmt + unit test * remove unused code * Addressing review comments * simplifying connection workflow * fix linter/go import issue * using base service quit * updating comment * Simplifying design + adjusting names * fixing linter issues * refactoring test harness + fixes * Addressing review comments * cleaning up * adding additional error check
5 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
5 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
5 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
5 years ago
blockchain: Reorg reactor (#3561) * go routines in blockchain reactor * Added reference to the go routine diagram * Initial commit * cleanup * Undo testing_logger change, committed by mistake * Fix the test loggers * pulled some fsm code into pool.go * added pool tests * changes to the design added block requests under peer moved the request trigger in the reactor poolRoutine, triggered now by a ticker in general moved everything required for making block requests smarter in the poolRoutine added a simple map of heights to keep track of what will need to be requested next added a few more tests * send errors to FSM in a different channel than blocks send errors (RemovePeer) from switch on a different channel than the one receiving blocks renamed channels added more pool tests * more pool tests * lint errors * more tests * more tests * switch fast sync to new implementation * fixed data race in tests * cleanup * finished fsm tests * address golangci comments :) * address golangci comments :) * Added timeout on next block needed to advance * updating docs and cleanup * fix issue in test from previous cleanup * cleanup * Added termination scenarios, tests and more cleanup * small fixes to adr, comments and cleanup * Fix bug in sendRequest() If we tried to send a request to a peer not present in the switch, a missing continue statement caused the request to be blackholed in a peer that was removed and never retried. While this bug was manifesting, the reactor kept asking for other blocks that would be stored and never consumed. Added the number of unconsumed blocks in the math for requesting blocks ahead of current processing height so eventually there will be no more blocks requested until the already received ones are consumed. 
* remove bpPeer's didTimeout field * Use distinct err codes for peer timeout and FSM timeouts * Don't allow peers to update with lower height * review comments from Ethan and Zarko * some cleanup, renaming, comments * Move block execution in separate goroutine * Remove pool's numPending * review comments * fix lint, remove old blockchain reactor and duplicates in fsm tests * small reorg around peer after review comments * add the reactor spec * verify block only once * review comments * change to int for max number of pending requests * cleanup and godoc * Add configuration flag fast sync version * golangci fixes * fix config template * move both reactor versions under blockchain * cleanup, golint, renaming stuff * updated documentation, fixed more golint warnings * integrate with behavior package * sync with master * gofmt * add changelog_pending entry * move to improvements * suggestion to changelog entry
5 years ago
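The sendRequest() fix described in the commit above (a missing continue caused a request to be recorded against a peer that was no longer in the switch) can be illustrated with a minimal sketch. This is not the reactor's actual code: the `peer` type, the `assignRequests` function, and the `connected` map are hypothetical names introduced here only to show the control-flow pattern.

```go
package main

import "fmt"

// peer is a hypothetical stand-in for a pool peer; it is not the reactor's real type.
type peer struct {
	id      string
	pending []int64 // heights requested from this peer
}

// assignRequests hands each height to the first candidate peer that is still
// connected. The connected map is a hypothetical stand-in for the switch's peer set.
// Heights that could not be assigned are returned so the caller can retry them.
func assignRequests(heights []int64, peers []*peer, connected map[string]bool) []int64 {
	var unassigned []int64
	for _, h := range heights {
		sent := false
		for _, p := range peers {
			if !connected[p.id] {
				// Without this continue, the request would be recorded against
				// a peer that can never answer it, and it would never be retried.
				continue
			}
			p.pending = append(p.pending, h)
			sent = true
			break
		}
		if !sent {
			unassigned = append(unassigned, h)
		}
	}
	return unassigned
}

func main() {
	gone := &peer{id: "gone"}
	live := &peer{id: "live"}
	left := assignRequests([]int64{10, 11}, []*peer{gone, live}, map[string]bool{"live": true})
	fmt.Println(len(gone.pending), live.pending, left)
}
```

In this sketch the skipped peer accumulates no pending heights, and anything that could not be sent is handed back to the caller rather than silently lost.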
p2p: implement new Transport interface (#5791) This implements a new `Transport` interface and related types for the P2P refactor in #5670. Previously, `conn.MConnection` was very tightly coupled to the `Peer` implementation -- in order to allow alternative non-multiplexed transports (e.g. QUIC), MConnection has now been moved below the `Transport` interface, as `MConnTransport`, and decoupled from the peer. Since the `p2p` package is not covered by our Go API stability, this is not considered a breaking change, and not listed in the changelog. The initial approach was to implement the new interface in its final form (which also involved possible protocol changes, see https://github.com/tendermint/spec/pull/227). However, it turned out that this would require a large number of changes to existing P2P code because of the previous tight coupling between `Peer` and `MConnection` and the reliance on subtleties in the MConnection behavior. Instead, I have broadened the `Transport` interface to expose much of the existing MConnection interface, preserved much of the existing MConnection logic and behavior in the transport implementation, and tried to make as few changes to the rest of the P2P stack as possible. We will instead reduce this interface gradually as we refactor other parts of the P2P stack. The low-level transport code and protocol (e.g. MConnection, SecretConnection and so on) has not been significantly changed, and refactoring this is not a priority until we come up with a plan for QUIC adoption, as we may end up discarding the MConnection code entirely. There are no tests of the new `MConnTransport`, as this code is likely to evolve as we proceed with the P2P refactor, but tests should be added before a final release. The E2E tests are sufficient for basic validation in the meantime.
4 years ago
limit number of /subscribe clients and queries per client (#3269) * limit number of /subscribe clients and queries per client Add the following config variables (under [rpc] section): * max_subscription_clients * max_subscriptions_per_client * timeout_broadcast_tx_commit Fixes #2826 new HTTPClient interface for subscriptions finalize HTTPClient events interface remove EventSubscriber fix data race ``` WARNING: DATA RACE Read at 0x00c000a36060 by goroutine 129: github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe.func1() /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:168 +0x1f0 Previous write at 0x00c000a36060 by goroutine 132: github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe() /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:191 +0x4e0 github.com/tendermint/tendermint/rpc/client.WaitForOneEvent() /go/src/github.com/tendermint/tendermint/rpc/client/helpers.go:64 +0x178 github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync.func1() /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:139 +0x298 testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 Goroutine 129 (running) created at: github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe() /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:164 +0x4b7 github.com/tendermint/tendermint/rpc/client.WaitForOneEvent() /go/src/github.com/tendermint/tendermint/rpc/client/helpers.go:64 +0x178 github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync.func1() /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:139 +0x298 testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 Goroutine 132 (running) created at: testing.(*T).Run() /usr/local/go/src/testing/testing.go:878 +0x659 github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync() /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:119 +0x186 
testing.tRunner() /usr/local/go/src/testing/testing.go:827 +0x162 ================== ``` lite client works (tested manually) godoc comments httpclient: do not close the out channel use TimeoutBroadcastTxCommit no timeout for unsubscribe but 1s Local (5s HTTP) timeout for resubscribe format code change Subscribe#out cap to 1 and replace config vars with RPCConfig TimeoutBroadcastTxCommit can't be greater than rpcserver.WriteTimeout rpc: Context as first parameter to all functions reformat code fixes after my own review fixes after Ethan's review add test stubs fix config.toml * fixes after manual testing - rpc: do not recommend to use BroadcastTxCommit because it's slow and wastes Tendermint resources (pubsub) - rpc: better error in Subscribe and BroadcastTxCommit - HTTPClient: do not resubscribe if err = ErrAlreadySubscribed * fixes after Ismail's review * Update rpc/grpc/grpc_test.go Co-Authored-By: melekes <anton.kalyaev@gmail.com>
5 years ago
mempool: move interface into mempool package (#3524) ## Description Refs #2659 Breaking changes in the mempool package: [mempool] #2659 Mempool now an interface old Mempool renamed to CListMempool NewMempool renamed to NewCListMempool Option renamed to CListOption MempoolReactor renamed to Reactor NewMempoolReactor renamed to NewReactor unexpose TxID method TxInfo.PeerID renamed to SenderID unexpose MempoolReactor.Mempool Breaking changes in the state package: [state] #2659 Mempool interface moved to mempool package MockMempool moved to top-level mock package and renamed to Mempool Non Breaking changes in the node package: [node] #2659 Add Mempool method, which allows you to access mempool ## Commits * move Mempool interface into mempool package Refs #2659 Breaking changes in the mempool package: - Mempool now an interface - old Mempool renamed to CListMempool Breaking changes to state package: - MockMempool moved to mempool/mock package and renamed to Mempool - Mempool interface moved to mempool package * assert CListMempool impl Mempool * gofmt code * rename MempoolReactor to Reactor - combine everything into one interface - rename TxInfo.PeerID to TxInfo.SenderID - unexpose MempoolReactor.Mempool * move mempool mock into top-level mock package * add a fixme TxsFront should not be a part of the Mempool interface because it leaks implementation details. Instead, we need to come up with general interface for querying the mempool so the MempoolReactor can fetch and broadcast txs to peers. * change node#Mempool to return interface * save commit = new reactor arch * Revert "save commit = new reactor arch" This reverts commit 1bfceacd9d65a720574683a7f22771e69af9af4d. 
* require CListMempool in mempool.Reactor * add two changelog entries * fixes after my own review * quote interfaces, structs and functions * fixes after Ismail's review * make node's mempool an interface * make InitWAL/CloseWAL methods a part of Mempool interface * fix merge conflicts * make node's mempool an interface
5 years ago
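The "assert CListMempool impl Mempool" step in the commit above refers to a common Go idiom: a compile-time check that a concrete type satisfies an interface. The sketch below shows the idiom only. The one-method `Mempool` interface, the `txs` field, and the `describe` helper are simplified, hypothetical stand-ins and do not reflect the real mempool package.

```go
package main

import "fmt"

// Mempool is a deliberately tiny, hypothetical stand-in for the real interface.
type Mempool interface {
	Size() int
}

// CListMempool is a hypothetical concrete implementation used for illustration.
type CListMempool struct {
	txs [][]byte
}

// Size reports how many transactions are held.
func (m *CListMempool) Size() int { return len(m.txs) }

// Compile-time assertion: the build fails if *CListMempool ever stops
// satisfying Mempool. It allocates nothing and has no runtime cost.
var _ Mempool = (*CListMempool)(nil)

// describe accepts the interface, so callers do not depend on the concrete type.
func describe(m Mempool) string {
	return fmt.Sprintf("mempool with %d txs", m.Size())
}

func main() {
	fmt.Println(describe(&CListMempool{txs: [][]byte{[]byte("tx1")}}))
}
```

Accepting the interface in `describe` mirrors the commit's stated direction of making the node's mempool an interface, so that a mock can be substituted in tests.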
mempool: move interface into mempool package (#3524) ## Description Refs #2659 Breaking changes in the mempool package: [mempool] #2659 Mempool now an interface old Mempool renamed to CListMempool NewMempool renamed to NewCListMempool Option renamed to CListOption MempoolReactor renamed to Reactor NewMempoolReactor renamed to NewReactor unexpose TxID method TxInfo.PeerID renamed to SenderID unexpose MempoolReactor.Mempool Breaking changes in the state package: [state] #2659 Mempool interface moved to mempool package MockMempool moved to top-level mock package and renamed to Mempool Non Breaking changes in the node package: [node] #2659 Add Mempool method, which allows you to access mempool ## Commits * move Mempool interface into mempool package Refs #2659 Breaking changes in the mempool package: - Mempool now an interface - old Mempool renamed to CListMempool Breaking changes to state package: - MockMempool moved to mempool/mock package and renamed to Mempool - Mempool interface moved to mempool package * assert CListMempool impl Mempool * gofmt code * rename MempoolReactor to Reactor - combine everything into one interface - rename TxInfo.PeerID to TxInfo.SenderID - unexpose MempoolReactor.Mempool * move mempool mock into top-level mock package * add a fixme TxsFront should not be a part of the Mempool interface because it leaks implementation details. Instead, we need to come up with general interface for querying the mempool so the MempoolReactor can fetch and broadcast txs to peers. * change node#Mempool to return interface * save commit = new reactor arch * Revert "save commit = new reactor arch" This reverts commit 1bfceacd9d65a720574683a7f22771e69af9af4d. 
* require CListMempool in mempool.Reactor * add two changelog entries * fixes after my own review * quote interfaces, structs and functions * fixes after Ismail's review * make node's mempool an interface * make InitWAL/CloseWAL methods a part of Mempool interface * fix merge conflicts * make node's mempool an interface
5 years ago
mempool: move interface into mempool package (#3524) ## Description Refs #2659 Breaking changes in the mempool package: [mempool] #2659 Mempool now an interface old Mempool renamed to CListMempool NewMempool renamed to NewCListMempool Option renamed to CListOption MempoolReactor renamed to Reactor NewMempoolReactor renamed to NewReactor unexpose TxID method TxInfo.PeerID renamed to SenderID unexpose MempoolReactor.Mempool Breaking changes in the state package: [state] #2659 Mempool interface moved to mempool package MockMempool moved to top-level mock package and renamed to Mempool Non Breaking changes in the node package: [node] #2659 Add Mempool method, which allows you to access mempool ## Commits * move Mempool interface into mempool package Refs #2659 Breaking changes in the mempool package: - Mempool now an interface - old Mempool renamed to CListMempool Breaking changes to state package: - MockMempool moved to mempool/mock package and renamed to Mempool - Mempool interface moved to mempool package * assert CListMempool impl Mempool * gofmt code * rename MempoolReactor to Reactor - combine everything into one interface - rename TxInfo.PeerID to TxInfo.SenderID - unexpose MempoolReactor.Mempool * move mempool mock into top-level mock package * add a fixme TxsFront should not be a part of the Mempool interface because it leaks implementation details. Instead, we need to come up with general interface for querying the mempool so the MempoolReactor can fetch and broadcast txs to peers. * change node#Mempool to return interface * save commit = new reactor arch * Revert "save commit = new reactor arch" This reverts commit 1bfceacd9d65a720574683a7f22771e69af9af4d. 
* require CListMempool in mempool.Reactor * add two changelog entries * fixes after my own review * quote interfaces, structs and functions * fixes after Ismail's review * make node's mempool an interface * make InitWAL/CloseWAL methods a part of Mempool interface * fix merge conflicts * make node's mempool an interface
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, and in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, and in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
blockchain: Reorg reactor (#3561) * go routines in blockchain reactor * Added reference to the go routine diagram * Initial commit * cleanup * Undo testing_logger change, committed by mistake * Fix the test loggers * pulled some fsm code into pool.go * added pool tests * changes to the design added block requests under peer moved the request trigger in the reactor poolRoutine, triggered now by a ticker in general moved everything required for making block requests smarter in the poolRoutine added a simple map of heights to keep track of what will need to be requested next added a few more tests * send errors to FSM in a different channel than blocks send errors (RemovePeer) from switch on a different channel than the one receiving blocks renamed channels added more pool tests * more pool tests * lint errors * more tests * more tests * switch fast sync to new implementation * fixed data race in tests * cleanup * finished fsm tests * address golangci comments :) * address golangci comments :) * Added timeout on next block needed to advance * updating docs and cleanup * fix issue in test from previous cleanup * cleanup * Added termination scenarios, tests and more cleanup * small fixes to adr, comments and cleanup * Fix bug in sendRequest() If we tried to send a request to a peer not present in the switch, a missing continue statement caused the request to be blackholed in a peer that was removed and never retried. While this bug was manifesting, the reactor kept asking for other blocks that would be stored and never consumed. Added the number of unconsumed blocks in the math for requesting blocks ahead of current processing height so eventually there will be no more blocks requested until the already received ones are consumed. 
* remove bpPeer's didTimeout field * Use distinct err codes for peer timeout and FSM timeouts * Don't allow peers to update with lower height * review comments from Ethan and Zarko * some cleanup, renaming, comments * Move block execution in separate goroutine * Remove pool's numPending * review comments * fix lint, remove old blockchain reactor and duplicates in fsm tests * small reorg around peer after review comments * add the reactor spec * verify block only once * review comments * change to int for max number of pending requests * cleanup and godoc * Add configuration flag fast sync version * golangci fixes * fix config template * move both reactor versions under blockchain * cleanup, golint, renaming stuff * updated documentation, fixed more golint warnings * integrate with behavior package * sync with master * gofmt * add changelog_pending entry * move to improvments * suggestion to changelog entry
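The `sendRequest()` fix described above comes down to one missing `continue`. The sketch below is not the reactor's code: `sendRequests`, its parameters and the peer names are hypothetical, chosen only to show how a request assigned to a removed peer is set aside for retry instead of being silently lost.

```go
package main

import "fmt"

// sendRequests tries to send one block request per height to the peer it was
// assigned to. Heights whose peer is no longer connected are returned in
// retry so that they can be reassigned later.
func sendRequests(heights []int64, assigned map[int64]string, connected map[string]bool) (sent, retry []int64) {
	for _, h := range heights {
		peer := assigned[h]
		if !connected[peer] {
			retry = append(retry, h)
			// Without this continue, the height would also be recorded as
			// sent, even though the peer is gone and will never answer.
			continue
		}
		sent = append(sent, h)
	}
	return sent, retry
}

func main() {
	assigned := map[int64]string{1: "peerA", 2: "peerB", 3: "peerA"}
	connected := map[string]bool{"peerA": true} // peerB was removed
	sent, retry := sendRequests([]int64{1, 2, 3}, assigned, connected)
	fmt.Println("sent:", sent, "retry:", retry)
}
```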
privval: refactor Remote signers (#3370) This PR is related to #3107 and a continuation of #3351 It is important to emphasise that in the privval original design, client/server and listening/dialing roles are inverted and do not follow a conventional interaction. Given two hosts A and B: Host A is listener/client Host B is dialer/server (contains the secret key) When A requires a signature, it needs to wait for B to dial in before it can issue a request. A only accepts a single connection and any failure leads to dropping the connection and waiting for B to reconnect. The original rationale behind this design was based on security. Host B only allows outbound connections to a list of whitelisted hosts. It is not possible to reach B unless B dials in. There are no listening/open ports in B. This PR results in the following changes: Refactors ping/heartbeat to avoid previously existing race conditions. Separates transport (dialer/listener) from signing (client/server) concerns to simplify workflow. Unifies and abstracts away the differences between unix and tcp sockets. A single signer endpoint implementation unifies connection handling code (read/write/close/connection obj) The signer request handler (server side) is customizable to increase testability. Updates and extends unit tests A high level overview of the classes is as follows: Transport (endpoints): The following classes take care of establishing a connection SignerDialerEndpoint SignerListeningEndpoint SignerEndpoint groups common functionality (read/write/timeouts/etc.) Signing (client/server): The following classes take care of exchanging request/responses SignerClient SignerServer This PR also closes #3601 Commits: * refactoring - work in progress * reworking unit tests * Encapsulating and fixing unit tests * Improve tests * Clean up * Fix/improve unit tests * clean up tests * Improving service endpoint * fixing unit test * fix linter issues * avoid invalid cache values (improve later?) 
* complete implementation * wip * improved connection loop * Improve reconnections + fixing unit tests * addressing comments * small formatting changes * clean up * Update node/node.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_client.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_client_test.go Co-Authored-By: jleni <juan.leni@zondax.ch> * check during initialization * dropping connecting when writing fails * removing break * use t.log instead * unifying and using cmn.GetFreePort() * review fixes * reordering and unifying drop connection * closing instead of signalling * refactored service loop * removed superfluous brackets * GetPubKey can return errors * Revert "GetPubKey can return errors" This reverts commit 68c06f19b4650389d7e5ab1659b318889028202c. * adding entry to changelog * Update CHANGELOG_PENDING.md Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_client.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_listener_endpoint_test.go Co-Authored-By: jleni <juan.leni@zondax.ch> * updating node.go * review fixes * fixes linter * fixing unit test * small fixes in comments * addressing review comments * addressing review comments 2 * reverting suggestion * Update privval/signer_client_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com> * Update privval/signer_client_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com> * Update privval/signer_listener_endpoint_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com> * do not expose brokenSignerDialerEndpoint * clean up logging * unifying methods shorten 
test time signer also drops * reenabling pings * improving testability + unit test * fixing go fmt + unit test * remove unused code * Addressing review comments * simplifying connection workflow * fix linter/go import issue * using base service quit * updating comment * Simplifying design + adjusting names * fixing linter issues * refactoring test harness + fixes * Addressing review comments * cleaning up * adding additional error check
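The inverted roles described in this PR (host A listens yet acts as the client, host B dials yet acts as the server) can be sketched with plain TCP on the loopback interface. This is an illustration under stated assumptions, not the privval implementation: `runSigner`, `requestSignature`, the newline-delimited wire format and the `signed(...)` reply are all invented for the example, and no real key or signature is involved.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
)

// runSigner plays host B: it dials out to addr, then serves a single
// signing request on the connection it opened.
func runSigner(addr string) error {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return err
	}
	defer conn.Close()

	req, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return err
	}
	// Toy "signature": a real signer would use the secret key here.
	_, err = fmt.Fprintf(conn, "signed(%s)\n", strings.TrimSpace(req))
	return err
}

// requestSignature plays host A: it listens, waits for the signer to dial
// in, and only then issues its request over the accepted connection.
func requestSignature(msg string) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()

	errCh := make(chan error, 1)
	go func() { errCh <- runSigner(ln.Addr().String()) }()

	// A cannot send anything until B has dialed in.
	conn, err := ln.Accept()
	if err != nil {
		return "", err
	}
	defer conn.Close()

	if _, err := fmt.Fprintf(conn, "%s\n", msg); err != nil {
		return "", err
	}
	resp, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return "", err
	}
	if err := <-errCh; err != nil {
		return "", err
	}
	return strings.TrimSpace(resp), nil
}

func main() {
	sig, err := requestSignature("vote-at-height-1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(sig)
}
```

The host holding the key never opens a listening port, which is the security rationale given above. The cost is that the requesting side has to block in `Accept` until the signer connects.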
Close and retry a RemoteSigner on err (#2923) * Close and recreate a RemoteSigner on err * Update changelog * Address Anton's comments / suggestions: - update changelog - restart TCPVal - shut down on `ErrUnexpectedResponse` * re-init remote signer client with fresh connection if Ping fails - add/update TODOs in secret connection - rename tcp.go -> tcp_client.go, same with ipc to clarify their purpose * account for `conn returned by waitConnection can be `nil` - also add TODO about RemoteSigner conn field * Tests for retrying: IPC / TCP - shorter info log on success - set conn and use it in tests to close conn * Tests for retrying: IPC / TCP - shorter info log on success - set conn and use it in tests to close conn - add rwmutex for conn field in IPC * comments and doc.go * fix ipc tests. fixes #2677 * use constants for tests * cleanup some error statements * fixes #2784, race in tests * remove print statement * minor fixes from review * update comment on sts spec * cosmetics * p2p/conn: add failing tests * p2p/conn: make SecretConnection thread safe * changelog * IPCVal signer refactor - use a .reset() method - don't use embedded RemoteSignerClient - guard RemoteSignerClient with mutex - drop the .conn - expose Close() on RemoteSignerClient * apply IPCVal refactor to TCPVal * remove mtx from RemoteSignerClient * consolidate IPCVal and TCPVal, fixes #3104 - done in tcp_client.go - now called SocketVal - takes a listener in the constructor - make tcpListener and unixListener contain all the differences * delete ipc files * introduce unix and tcp dialer for RemoteSigner * rename files - drop tcp_ prefix - rename priv_validator.go to file.go * bring back listener options * fix node * fix priv_val_server * fix node test * minor cleanup and comments
package node

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	_ "net/http/pprof" // nolint: gosec // securely exposed on separate, optional port
	"strconv"
	"time"

	_ "github.com/lib/pq" // provide the psql db driver
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
	"github.com/rs/cors"

	abci "github.com/tendermint/tendermint/abci/types"
	cfg "github.com/tendermint/tendermint/config"
	cs "github.com/tendermint/tendermint/consensus"
	"github.com/tendermint/tendermint/crypto"
	"github.com/tendermint/tendermint/evidence"
	tmjson "github.com/tendermint/tendermint/libs/json"
	"github.com/tendermint/tendermint/libs/log"
	tmnet "github.com/tendermint/tendermint/libs/net"
	tmpubsub "github.com/tendermint/tendermint/libs/pubsub"
	"github.com/tendermint/tendermint/libs/service"
	"github.com/tendermint/tendermint/libs/strings"
	"github.com/tendermint/tendermint/light"
	"github.com/tendermint/tendermint/mempool"
	"github.com/tendermint/tendermint/p2p"
	"github.com/tendermint/tendermint/p2p/pex"
	"github.com/tendermint/tendermint/privval"
	tmgrpc "github.com/tendermint/tendermint/privval/grpc"
	"github.com/tendermint/tendermint/proxy"
	rpccore "github.com/tendermint/tendermint/rpc/core"
	grpccore "github.com/tendermint/tendermint/rpc/grpc"
	rpcserver "github.com/tendermint/tendermint/rpc/jsonrpc/server"
	sm "github.com/tendermint/tendermint/state"
	"github.com/tendermint/tendermint/state/indexer"
	"github.com/tendermint/tendermint/statesync"
	"github.com/tendermint/tendermint/store"
	"github.com/tendermint/tendermint/types"
	tmtime "github.com/tendermint/tendermint/types/time"
	dbm "github.com/tendermint/tm-db"
)

// nodeImpl is the highest level interface to a full Tendermint node.
// It includes all configuration information and running services.
type nodeImpl struct {
	service.BaseService

	// config
	config        *cfg.Config
	genesisDoc    *types.GenesisDoc   // initial validator set
	privValidator types.PrivValidator // local node's validator key

	// network
	transport   *p2p.MConnTransport
	sw          *p2p.Switch // p2p connections
	peerManager *p2p.PeerManager
	router      *p2p.Router
	addrBook    pex.AddrBook // known peers
	nodeInfo    p2p.NodeInfo
	nodeKey     p2p.NodeKey // our node privkey
	isListening bool

	// services
	eventBus          *types.EventBus // pub/sub for services
	stateStore        sm.Store
	blockStore        *store.BlockStore // store the blockchain to disk
	bcReactor         service.Service   // for fast-syncing
	mempoolReactor    service.Service   // for gossipping transactions
	mempool           mempool.Mempool
	stateSync         bool                    // whether the node should state sync on startup
	stateSyncReactor  *statesync.Reactor      // for hosting and restoring state sync snapshots
	stateSyncProvider statesync.StateProvider // provides state data for bootstrapping a node
	stateSyncGenesis  sm.State                // provides the genesis state for state sync
	consensusState    *cs.State               // latest consensus state
	consensusReactor  *cs.Reactor             // for participating in the consensus
	pexReactor        *pex.Reactor            // for exchanging peer addresses
	pexReactorV2      *pex.ReactorV2          // for exchanging peer addresses
	evidenceReactor   *evidence.Reactor
	evidencePool      *evidence.Pool // tracking evidence
	proxyApp          proxy.AppConns // connection to the application
	rpcListeners      []net.Listener // rpc servers
	eventSinks        []indexer.EventSink
	indexerService    *indexer.Service
	prometheusSrv     *http.Server
}

// newDefaultNode returns a Tendermint node with default settings for the
// PrivValidator, ClientCreator, GenesisDoc, and DBProvider.
// It implements NodeProvider.
func newDefaultNode(config *cfg.Config, logger log.Logger) (service.Service, error) {
	nodeKey, err := p2p.LoadOrGenNodeKey(config.NodeKeyFile())
	if err != nil {
		return nil, fmt.Errorf("failed to load or gen node key %s: %w", config.NodeKeyFile(), err)
	}
	if config.Mode == cfg.ModeSeed {
		return makeSeedNode(config,
			cfg.DefaultDBProvider,
			nodeKey,
			defaultGenesisDocProviderFunc(config),
			logger,
		)
	}

	var pval *privval.FilePV
	if config.Mode == cfg.ModeValidator {
		pval, err = privval.LoadOrGenFilePV(config.PrivValidatorKeyFile(), config.PrivValidatorStateFile())
		if err != nil {
			return nil, err
		}
	} else {
		pval = nil
	}

	appClient, _ := proxy.DefaultClientCreator(config.ProxyApp, config.ABCI, config.DBDir())
	return makeNode(config,
		pval,
		nodeKey,
		appClient,
		defaultGenesisDocProviderFunc(config),
		cfg.DefaultDBProvider,
		logger,
	)
}

// makeNode returns a new, ready to go, Tendermint Node.
func makeNode(config *cfg.Config,
	privValidator types.PrivValidator,
	nodeKey p2p.NodeKey,
	clientCreator proxy.ClientCreator,
	genesisDocProvider genesisDocProvider,
	dbProvider cfg.DBProvider,
	logger log.Logger) (service.Service, error) {

	blockStore, stateDB, err := initDBs(config, dbProvider)
	if err != nil {
		return nil, err
	}

	stateStore := sm.NewStore(stateDB)

	state, genDoc, err := loadStateFromDBOrGenesisDocProvider(stateDB, genesisDocProvider)
	if err != nil {
		return nil, err
	}

	// Create the proxyApp and establish connections to the ABCI app (consensus, mempool, query).
	proxyApp, err := createAndStartProxyAppConns(clientCreator, logger)
	if err != nil {
		return nil, err
	}

	// EventBus and IndexerService must be started before the handshake because
	// we might need to index the txs of the replayed block as this might not have happened
	// when the node stopped last time (i.e. the node stopped after it saved the block
	// but before it indexed the txs, or, endblocker panicked)
	eventBus, err := createAndStartEventBus(logger)
	if err != nil {
		return nil, err
	}

	indexerService, eventSinks, err := createAndStartIndexerService(config, dbProvider, eventBus, logger, genDoc.ChainID)
	if err != nil {
		return nil, err
	}

	// If an address is provided, listen on the socket for a connection from an
	// external signing process.
	if config.PrivValidator.ListenAddr != "" {
		protocol, _ := tmnet.ProtocolAndAddress(config.PrivValidator.ListenAddr)
		// FIXME: we should start services inside OnStart
		switch protocol {
		case "grpc":
			privValidator, err = createAndStartPrivValidatorGRPCClient(config, genDoc.ChainID, logger)
			if err != nil {
				return nil, fmt.Errorf("error with private validator grpc client: %w", err)
			}
		default:
			privValidator, err = createAndStartPrivValidatorSocketClient(config.PrivValidator.ListenAddr, genDoc.ChainID, logger)
			if err != nil {
				return nil, fmt.Errorf("error with private validator socket client: %w", err)
			}
		}
	}

	var pubKey crypto.PubKey
	if config.Mode == cfg.ModeValidator {
		pubKey, err = privValidator.GetPubKey(context.TODO())
		if err != nil {
			return nil, fmt.Errorf("can't get pubkey: %w", err)
		}
		if pubKey == nil {
			return nil, errors.New("could not retrieve public key from private validator")
		}
	}

	// Determine whether we should attempt state sync.
	stateSync := config.StateSync.Enable && !onlyValidatorIsUs(state, pubKey)
	if stateSync && state.LastBlockHeight > 0 {
		logger.Info("Found local state with non-zero height, skipping state sync")
		stateSync = false
	}

	// Create the handshaker, which calls RequestInfo, sets the AppVersion on the state,
	// and replays any blocks as necessary to sync tendermint with the app.
	consensusLogger := logger.With("module", "consensus")
	if !stateSync {
		if err := doHandshake(stateStore, state, blockStore, genDoc, eventBus, proxyApp, consensusLogger); err != nil {
			return nil, err
		}

		// Reload the state. It will have the Version.Consensus.App set by the
		// Handshake, and may have other modifications as well (ie. depending on
		// what happened during block replay).
		state, err = stateStore.Load()
		if err != nil {
			return nil, fmt.Errorf("cannot load state: %w", err)
		}
	}

	// Determine whether we should do fast sync. This must happen after the handshake, since the
	// app may modify the validator set, specifying ourself as the only validator.
	fastSync := config.FastSyncMode && !onlyValidatorIsUs(state, pubKey)

	logNodeStartupInfo(state, pubKey, logger, consensusLogger, config.Mode)

	// TODO: Fetch and provide real options and do proper p2p bootstrapping.
	// TODO: Use a persistent peer database.
	nodeInfo, err := makeNodeInfo(config, nodeKey, eventSinks, genDoc, state)
	if err != nil {
		return nil, err
	}

	p2pLogger := logger.With("module", "p2p")
	transport := createTransport(p2pLogger, config)

	peerManager, err := createPeerManager(config, dbProvider, p2pLogger, nodeKey.ID)
	if err != nil {
		return nil, fmt.Errorf("failed to create peer manager: %w", err)
	}

	csMetrics, p2pMetrics, memplMetrics, smMetrics := defaultMetricsProvider(config.Instrumentation)(genDoc.ChainID)

	router, err := createRouter(p2pLogger, p2pMetrics, nodeInfo, nodeKey.PrivKey,
		peerManager, transport, getRouterConfig(config, proxyApp))
	if err != nil {
		return nil, fmt.Errorf("failed to create router: %w", err)
	}

	mpReactorShim, mpReactor, mp, err := createMempoolReactor(
		config, proxyApp, state, memplMetrics, peerManager, router, logger,
	)
	if err != nil {
		return nil, err
	}

	evReactorShim, evReactor, evPool, err := createEvidenceReactor(
		config, dbProvider, stateDB, blockStore, peerManager, router, logger,
	)
	if err != nil {
		return nil, err
	}

	// make block executor for consensus and blockchain reactors to execute blocks
	blockExec := sm.NewBlockExecutor(
		stateStore,
		logger.With("module", "state"),
		proxyApp.Consensus(),
		mp,
		evPool,
		sm.BlockExecutorWithMetrics(smMetrics),
	)

	csReactorShim, csReactor, csState := createConsensusReactor(
		config, state, blockExec, blockStore, mp, evPool,
		privValidator, csMetrics, stateSync || fastSync, eventBus,
		peerManager, router, consensusLogger,
	)

	// Create the blockchain reactor. Note, we do not start fast sync if we're
	// doing a state sync first.
	bcReactorShim, bcReactor, err := createBlockchainReactor(
		logger, config, state, blockExec, blockStore, csReactor,
		peerManager, router, fastSync && !stateSync,
	)
	if err != nil {
		return nil, fmt.Errorf("could not create blockchain reactor: %w", err)
	}

	// TODO: Remove this once the switch is removed.
	var bcReactorForSwitch p2p.Reactor
	if bcReactorShim != nil {
		bcReactorForSwitch = bcReactorShim
	} else {
		bcReactorForSwitch = bcReactor.(p2p.Reactor)
	}

	// Make ConsensusReactor. Don't enable fully if doing a state sync and/or fast sync first.
	// FIXME We need to update metrics here, since other reactors don't have access to them.
	if stateSync {
		csMetrics.StateSyncing.Set(1)
	} else if fastSync {
		csMetrics.FastSyncing.Set(1)
	}

	// Set up state sync reactor, and schedule a sync if requested.
	// FIXME The way we do phased startups (e.g. replay -> fast sync -> consensus) is very messy,
	// we should clean this whole thing up. See:
	// https://github.com/tendermint/tendermint/issues/4644
	var (
		stateSyncReactor     *statesync.Reactor
		stateSyncReactorShim *p2p.ReactorShim

		channels    map[p2p.ChannelID]*p2p.Channel
		peerUpdates *p2p.PeerUpdates
	)

	stateSyncReactorShim = p2p.NewReactorShim(logger.With("module", "statesync"), "StateSyncShim", statesync.ChannelShims)

	if config.P2P.DisableLegacy {
		channels = makeChannelsFromShims(router, statesync.ChannelShims)
		peerUpdates = peerManager.Subscribe()
	} else {
		channels = getChannelsFromShim(stateSyncReactorShim)
		peerUpdates = stateSyncReactorShim.PeerUpdates
	}

	stateSyncReactor = statesync.NewReactor(
		stateSyncReactorShim.Logger,
		proxyApp.Snapshot(),
		proxyApp.Query(),
		channels[statesync.SnapshotChannel],
		channels[statesync.ChunkChannel],
		peerUpdates,
		config.StateSync.TempDir,
	)

	// add the channel descriptors to both the transports
	// FIXME: This should be removed when the legacy p2p stack is removed and
	// transports can either be agnostic to channel descriptors or can be
	// declared in the constructor.
	transport.AddChannelDescriptors(mpReactorShim.GetChannels())
	transport.AddChannelDescriptors(bcReactorForSwitch.GetChannels())
	transport.AddChannelDescriptors(csReactorShim.GetChannels())
	transport.AddChannelDescriptors(evReactorShim.GetChannels())
	transport.AddChannelDescriptors(stateSyncReactorShim.GetChannels())

	// Optionally, start the pex reactor
	//
	// TODO:
	//
	// We need to set Seeds and PersistentPeers on the switch,
	// since it needs to be able to use these (and their DNS names)
	// even if the PEX is off. We can include the DNS name in the NetAddress,
	// but it would still be nice to have a clear list of the current "PersistentPeers"
	// somewhere that we can return with net_info.
	//
	// If PEX is on, it should handle dialing the seeds. Otherwise the switch does it.
	// Note we currently use the addrBook regardless at least for AddOurAddress
	var (
		pexReactor   *pex.Reactor
		pexReactorV2 *pex.ReactorV2
		sw           *p2p.Switch
		addrBook     pex.AddrBook
	)

	pexCh := pex.ChannelDescriptor()
	transport.AddChannelDescriptors([]*p2p.ChannelDescriptor{&pexCh})

	if config.P2P.DisableLegacy {
		addrBook = nil
		pexReactorV2, err = createPEXReactorV2(config, logger, peerManager, router)
		if err != nil {
			return nil, err
		}
	} else {
		// setup Transport and Switch
		sw = createSwitch(
			config, transport, p2pMetrics, mpReactorShim, bcReactorForSwitch,
			stateSyncReactorShim, csReactorShim, evReactorShim, proxyApp, nodeInfo, nodeKey, p2pLogger,
		)

		err = sw.AddPersistentPeers(strings.SplitAndTrimEmpty(config.P2P.PersistentPeers, ",", " "))
		if err != nil {
			return nil, fmt.Errorf("could not add peers from persistent-peers field: %w", err)
		}

		err = sw.AddUnconditionalPeerIDs(strings.SplitAndTrimEmpty(config.P2P.UnconditionalPeerIDs, ",", " "))
		if err != nil {
			return nil, fmt.Errorf("could not add peer ids from unconditional_peer_ids field: %w", err)
		}

		addrBook, err = createAddrBookAndSetOnSwitch(config, sw, p2pLogger, nodeKey)
		if err != nil {
			return nil, fmt.Errorf("could not create addrbook: %w", err)
		}

		pexReactor = createPEXReactorAndAddToSwitch(addrBook, config, sw, logger)
	}

	if config.RPC.PprofListenAddress != "" {
		go func() {
			logger.Info("Starting pprof server", "laddr", config.RPC.PprofListenAddress)
			logger.Error("pprof server error", "err", http.ListenAndServe(config.RPC.PprofListenAddress, nil))
		}()
	}

	node := &nodeImpl{
		config:        config,
		genesisDoc:    genDoc,
		privValidator: privValidator,

		transport:   transport,
		sw:          sw,
		peerManager: peerManager,
		router:      router,
		addrBook:    addrBook,
		nodeInfo:    nodeInfo,
		nodeKey:     nodeKey,

		stateStore:       stateStore,
		blockStore:       blockStore,
		bcReactor:        bcReactor,
		mempoolReactor:   mpReactor,
		mempool:          mp,
		consensusState:   csState,
		consensusReactor: csReactor,
		stateSyncReactor: stateSyncReactor,
		stateSync:        stateSync,
		stateSyncGenesis: state, // Shouldn't be necessary, but need a way to pass the genesis state
		pexReactor:       pexReactor,
		pexReactorV2:     pexReactorV2,
		evidenceReactor:  evReactor,
		evidencePool:     evPool,
		proxyApp:         proxyApp,
		indexerService:   indexerService,
		eventBus:         eventBus,
		eventSinks:       eventSinks,
	}
	node.BaseService = *service.NewBaseService(logger, "Node", node)

	return node, nil
}


// makeSeedNode returns a new seed node, containing only p2p, pex reactor
func makeSeedNode(config *cfg.Config,
	dbProvider cfg.DBProvider,
	nodeKey p2p.NodeKey,
	genesisDocProvider genesisDocProvider,
	logger log.Logger,
) (service.Service, error) {
	genDoc, err := genesisDocProvider()
	if err != nil {
		return nil, err
	}

	state, err := sm.MakeGenesisState(genDoc)
	if err != nil {
		return nil, err
	}

	nodeInfo, err := makeSeedNodeInfo(config, nodeKey, genDoc, state)
	if err != nil {
		return nil, err
	}

	// Setup Transport and Switch.
	p2pMetrics := p2p.PrometheusMetrics(config.Instrumentation.Namespace, "chain_id", genDoc.ChainID)
	p2pLogger := logger.With("module", "p2p")
	transport := createTransport(p2pLogger, config)
	sw := createSwitch(
		config, transport, p2pMetrics, nil, nil,
		nil, nil, nil, nil, nodeInfo, nodeKey, p2pLogger,
	)

	err = sw.AddPersistentPeers(strings.SplitAndTrimEmpty(config.P2P.PersistentPeers, ",", " "))
	if err != nil {
		return nil, fmt.Errorf("could not add peers from persistent_peers field: %w", err)
	}

	err = sw.AddUnconditionalPeerIDs(strings.SplitAndTrimEmpty(config.P2P.UnconditionalPeerIDs, ",", " "))
	if err != nil {
		return nil, fmt.Errorf("could not add peer ids from unconditional_peer_ids field: %w", err)
	}

	addrBook, err := createAddrBookAndSetOnSwitch(config, sw, p2pLogger, nodeKey)
	if err != nil {
		return nil, fmt.Errorf("could not create addrbook: %w", err)
	}

	peerManager, err := createPeerManager(config, dbProvider, p2pLogger, nodeKey.ID)
	if err != nil {
		return nil, fmt.Errorf("failed to create peer manager: %w", err)
	}

	router, err := createRouter(p2pLogger, p2pMetrics, nodeInfo, nodeKey.PrivKey,
		peerManager, transport, getRouterConfig(config, nil))
	if err != nil {
		return nil, fmt.Errorf("failed to create router: %w", err)
	}

	var (
		pexReactor   *pex.Reactor
		pexReactorV2 *pex.ReactorV2
	)

	// add the pex reactor
	// FIXME: we add channel descriptors to both the router and the transport but only the router
	// should be aware of channel info. We should remove this from transport once the legacy
	// p2p stack is removed.
	pexCh := pex.ChannelDescriptor()
	transport.AddChannelDescriptors([]*p2p.ChannelDescriptor{&pexCh})

	if config.P2P.DisableLegacy {
		pexReactorV2, err = createPEXReactorV2(config, logger, peerManager, router)
		if err != nil {
			return nil, err
		}
	} else {
		pexReactor = createPEXReactorAndAddToSwitch(addrBook, config, sw, logger)
	}

	if config.RPC.PprofListenAddress != "" {
		go func() {
			logger.Info("Starting pprof server", "laddr", config.RPC.PprofListenAddress)
			logger.Error("pprof server error", "err", http.ListenAndServe(config.RPC.PprofListenAddress, nil))
		}()
	}

	node := &nodeImpl{
		config:       config,
		genesisDoc:   genDoc,
		transport:    transport,
		sw:           sw,
		addrBook:     addrBook,
		nodeInfo:     nodeInfo,
		nodeKey:      nodeKey,
		peerManager:  peerManager,
		router:       router,
		pexReactor:   pexReactor,
		pexReactorV2: pexReactorV2,
	}
	node.BaseService = *service.NewBaseService(logger, "SeedNode", node)

	return node, nil
}

// Temporary interface for switching to fast sync, we should get rid of v0.
// See: https://github.com/tendermint/tendermint/issues/4595
type fastSyncReactor interface {
	SwitchToFastSync(sm.State) error
}

// OnStart starts the Node. It implements service.Service.
func (n *nodeImpl) OnStart() error {
	now := tmtime.Now()
	genTime := n.genesisDoc.GenesisTime
	if genTime.After(now) {
		n.Logger.Info("Genesis time is in the future. Sleeping until then...", "genTime", genTime)
		time.Sleep(genTime.Sub(now))
	}

	// Start the RPC server before the P2P server
	// so we can e.g. receive txs for the first block
	if n.config.RPC.ListenAddress != "" && n.config.Mode != cfg.ModeSeed {
		listeners, err := n.startRPC()
		if err != nil {
			return err
		}
		n.rpcListeners = listeners
	}

	if n.config.Instrumentation.Prometheus &&
		n.config.Instrumentation.PrometheusListenAddr != "" {
		n.prometheusSrv = n.startPrometheusServer(n.config.Instrumentation.PrometheusListenAddr)
	}

	// Start the transport.
	addr, err := p2p.NewNetAddressString(p2p.IDAddressString(n.nodeKey.ID, n.config.P2P.ListenAddress))
	if err != nil {
		return err
	}
	if err := n.transport.Listen(addr.Endpoint()); err != nil {
		return err
	}

	n.isListening = true

	n.Logger.Info("p2p service", "legacy_enabled", !n.config.P2P.DisableLegacy)

	if n.config.P2P.DisableLegacy {
		err = n.router.Start()
	} else {
		// Add private IDs to addrbook to block those peers being added
		n.addrBook.AddPrivateIDs(strings.SplitAndTrimEmpty(n.config.P2P.PrivatePeerIDs, ",", " "))
		err = n.sw.Start()
	}
	if err != nil {
		return err
	}

	if n.config.Mode != cfg.ModeSeed {
		if n.config.FastSync.Version == cfg.BlockchainV0 {
			// Start the real blockchain reactor separately since the switch uses the shim.
			if err := n.bcReactor.Start(); err != nil {
				return err
			}
		}

		// Start the real consensus reactor separately since the switch uses the shim.
		if err := n.consensusReactor.Start(); err != nil {
			return err
		}

		// Start the real state sync reactor separately since the switch uses the shim.
		if err := n.stateSyncReactor.Start(); err != nil {
			return err
		}

		// Start the real mempool reactor separately since the switch uses the shim.
		if err := n.mempoolReactor.Start(); err != nil {
			return err
		}

		// Start the real evidence reactor separately since the switch uses the shim.
		if err := n.evidenceReactor.Start(); err != nil {
			return err
		}
	}

	if n.config.P2P.DisableLegacy && n.pexReactorV2 != nil {
		if err := n.pexReactorV2.Start(); err != nil {
			return err
		}
	} else {
		// Always connect to persistent peers
		err = n.sw.DialPeersAsync(strings.SplitAndTrimEmpty(n.config.P2P.PersistentPeers, ",", " "))
		if err != nil {
			return fmt.Errorf("could not dial peers from persistent-peers field: %w", err)
		}
	}

	// Run state sync
	if n.stateSync {
		bcR, ok := n.bcReactor.(fastSyncReactor)
		if !ok {
			return fmt.Errorf("this blockchain reactor does not support switching from state sync")
		}

		err := startStateSync(n.stateSyncReactor, bcR, n.consensusReactor, n.stateSyncProvider,
			n.config.StateSync, n.config.FastSyncMode, n.stateStore, n.blockStore, n.stateSyncGenesis)
		if err != nil {
			return fmt.Errorf("failed to start state sync: %w", err)
		}
	}

	return nil
}

// OnStop stops the Node. It implements service.Service.
func (n *nodeImpl) OnStop() {
	n.Logger.Info("Stopping Node")

	// first stop the non-reactor services
	if err := n.eventBus.Stop(); err != nil {
		n.Logger.Error("Error closing eventBus", "err", err)
	}
	if err := n.indexerService.Stop(); err != nil {
		n.Logger.Error("Error closing indexerService", "err", err)
	}

	if n.config.Mode != cfg.ModeSeed {
		// now stop the reactors
		if n.config.FastSync.Version == cfg.BlockchainV0 {
			// Stop the real blockchain reactor separately since the switch uses the shim.
			if err := n.bcReactor.Stop(); err != nil {
				n.Logger.Error("failed to stop the blockchain reactor", "err", err)
			}
		}

		// Stop the real consensus reactor separately since the switch uses the shim.
		if err := n.consensusReactor.Stop(); err != nil {
			n.Logger.Error("failed to stop the consensus reactor", "err", err)
		}

		// Stop the real state sync reactor separately since the switch uses the shim.
		if err := n.stateSyncReactor.Stop(); err != nil {
			n.Logger.Error("failed to stop the state sync reactor", "err", err)
		}

		// Stop the real mempool reactor separately since the switch uses the shim.
		if err := n.mempoolReactor.Stop(); err != nil {
			n.Logger.Error("failed to stop the mempool reactor", "err", err)
		}

		// Stop the real evidence reactor separately since the switch uses the shim.
		if err := n.evidenceReactor.Stop(); err != nil {
			n.Logger.Error("failed to stop the evidence reactor", "err", err)
		}
	}

	if n.config.P2P.DisableLegacy && n.pexReactorV2 != nil {
		if err := n.pexReactorV2.Stop(); err != nil {
			n.Logger.Error("failed to stop the PEX v2 reactor", "err", err)
		}
	}

	if n.config.P2P.DisableLegacy {
		if err := n.router.Stop(); err != nil {
			n.Logger.Error("failed to stop router", "err", err)
		}
	} else {
		if err := n.sw.Stop(); err != nil {
			n.Logger.Error("failed to stop switch", "err", err)
		}
	}

	if err := n.transport.Close(); err != nil {
		n.Logger.Error("Error closing transport", "err", err)
	}

	n.isListening = false

	// finally stop the listeners / external services
	for _, l := range n.rpcListeners {
		n.Logger.Info("Closing rpc listener", "listener", l)
		if err := l.Close(); err != nil {
			n.Logger.Error("Error closing listener", "listener", l, "err", err)
		}
	}

	if pvsc, ok := n.privValidator.(service.Service); ok {
		if err := pvsc.Stop(); err != nil {
			n.Logger.Error("Error closing private validator", "err", err)
		}
	}

	if n.prometheusSrv != nil {
		if err := n.prometheusSrv.Shutdown(context.Background()); err != nil {
			// Error from closing listeners, or context timeout:
			n.Logger.Error("Prometheus HTTP server Shutdown", "err", err)
		}
	}
}

// ConfigureRPC makes sure RPC has all the objects it needs to operate.
func (n *nodeImpl) ConfigureRPC() (*rpccore.Environment, error) {
	rpcCoreEnv := rpccore.Environment{
		ProxyAppQuery:    n.proxyApp.Query(),
		ProxyAppMempool:  n.proxyApp.Mempool(),
		StateStore:       n.stateStore,
		BlockStore:       n.blockStore,
		EvidencePool:     n.evidencePool,
		ConsensusState:   n.consensusState,
		P2PPeers:         n.sw,
		P2PTransport:     n,
		GenDoc:           n.genesisDoc,
		EventSinks:       n.eventSinks,
		ConsensusReactor: n.consensusReactor,
		EventBus:         n.eventBus,
		Mempool:          n.mempool,
		Logger:           n.Logger.With("module", "rpc"),
		Config:           *n.config.RPC,
	}

	if n.config.Mode == cfg.ModeValidator {
		pubKey, err := n.privValidator.GetPubKey(context.TODO())
		if err != nil {
			return nil, fmt.Errorf("can't get pubkey: %w", err)
		}
		// Report a nil key separately so that a nil error is never wrapped with %w.
		if pubKey == nil {
			return nil, errors.New("can't get pubkey: private validator returned a nil key")
		}
		rpcCoreEnv.PubKey = pubKey
	}

	if err := rpcCoreEnv.InitGenesisChunks(); err != nil {
		return nil, err
	}

	return &rpcCoreEnv, nil
}

func (n *nodeImpl) startRPC() ([]net.Listener, error) {
	env, err := n.ConfigureRPC()
	if err != nil {
		return nil, err
	}

	listenAddrs := strings.SplitAndTrimEmpty(n.config.RPC.ListenAddress, ",", " ")
	routes := env.GetRoutes()

	if n.config.RPC.Unsafe {
		env.AddUnsafe(routes)
	}

	config := rpcserver.DefaultConfig()
	config.MaxBodyBytes = n.config.RPC.MaxBodyBytes
	config.MaxHeaderBytes = n.config.RPC.MaxHeaderBytes
	config.MaxOpenConnections = n.config.RPC.MaxOpenConnections
	// If necessary adjust global WriteTimeout to ensure it's greater than
	// TimeoutBroadcastTxCommit.
	// See https://github.com/tendermint/tendermint/issues/3435
	if config.WriteTimeout <= n.config.RPC.TimeoutBroadcastTxCommit {
		config.WriteTimeout = n.config.RPC.TimeoutBroadcastTxCommit + 1*time.Second
	}

	// we may expose the rpc over both a unix and tcp socket
	listeners := make([]net.Listener, len(listenAddrs))
	for i, listenAddr := range listenAddrs {
		mux := http.NewServeMux()
		rpcLogger := n.Logger.With("module", "rpc-server")
		wmLogger := rpcLogger.With("protocol", "websocket")
		wm := rpcserver.NewWebsocketManager(routes,
			rpcserver.OnDisconnect(func(remoteAddr string) {
				err := n.eventBus.UnsubscribeAll(context.Background(), remoteAddr)
				if err != nil && err != tmpubsub.ErrSubscriptionNotFound {
					wmLogger.Error("Failed to unsubscribe addr from events", "addr", remoteAddr, "err", err)
				}
			}),
			rpcserver.ReadLimit(config.MaxBodyBytes),
		)
		wm.SetLogger(wmLogger)
		mux.HandleFunc("/websocket", wm.WebsocketHandler)
		rpcserver.RegisterRPCFuncs(mux, routes, rpcLogger)
		listener, err := rpcserver.Listen(
			listenAddr,
			config,
		)
		if err != nil {
			return nil, err
		}

		var rootHandler http.Handler = mux
		if n.config.RPC.IsCorsEnabled() {
			corsMiddleware := cors.New(cors.Options{
				AllowedOrigins: n.config.RPC.CORSAllowedOrigins,
				AllowedMethods: n.config.RPC.CORSAllowedMethods,
				AllowedHeaders: n.config.RPC.CORSAllowedHeaders,
			})
			rootHandler = corsMiddleware.Handler(mux)
		}
		if n.config.RPC.IsTLSEnabled() {
			go func() {
				if err := rpcserver.ServeTLS(
					listener,
					rootHandler,
					n.config.RPC.CertFile(),
					n.config.RPC.KeyFile(),
					rpcLogger,
					config,
				); err != nil {
					n.Logger.Error("Error serving server with TLS", "err", err)
				}
			}()
		} else {
			go func() {
				if err := rpcserver.Serve(
					listener,
					rootHandler,
					rpcLogger,
					config,
				); err != nil {
					n.Logger.Error("Error serving server", "err", err)
				}
			}()
		}

		listeners[i] = listener
	}

	// we expose a simplified api over grpc for convenience to app devs
	grpcListenAddr := n.config.RPC.GRPCListenAddress
	if grpcListenAddr != "" {
		config := rpcserver.DefaultConfig()
		config.MaxBodyBytes = n.config.RPC.MaxBodyBytes
		config.MaxHeaderBytes = n.config.RPC.MaxHeaderBytes
		// NOTE: GRPCMaxOpenConnections is used, not MaxOpenConnections
		config.MaxOpenConnections = n.config.RPC.GRPCMaxOpenConnections
		// If necessary adjust global WriteTimeout to ensure it's greater than
		// TimeoutBroadcastTxCommit.
		// See https://github.com/tendermint/tendermint/issues/3435
		if config.WriteTimeout <= n.config.RPC.TimeoutBroadcastTxCommit {
			config.WriteTimeout = n.config.RPC.TimeoutBroadcastTxCommit + 1*time.Second
		}
		listener, err := rpcserver.Listen(grpcListenAddr, config)
		if err != nil {
			return nil, err
		}
		go func() {
			if err := grpccore.StartGRPCServer(env, listener); err != nil {
				n.Logger.Error("Error starting gRPC server", "err", err)
			}
		}()
		listeners = append(listeners, listener)
	}

	return listeners, nil
}

// startPrometheusServer starts a Prometheus HTTP server, listening for metrics
// collectors on addr.
func (n *nodeImpl) startPrometheusServer(addr string) *http.Server {
	srv := &http.Server{
		Addr: addr,
		Handler: promhttp.InstrumentMetricHandler(
			prometheus.DefaultRegisterer, promhttp.HandlerFor(
				prometheus.DefaultGatherer,
				promhttp.HandlerOpts{MaxRequestsInFlight: n.config.Instrumentation.MaxOpenConnections},
			),
		),
	}
	go func() {
		if err := srv.ListenAndServe(); err != http.ErrServerClosed {
			// Error starting or closing listener:
			n.Logger.Error("Prometheus HTTP server ListenAndServe", "err", err)
		}
	}()
	return srv
}

// Switch returns the Node's Switch.
func (n *nodeImpl) Switch() *p2p.Switch {
	return n.sw
}

// BlockStore returns the Node's BlockStore.
func (n *nodeImpl) BlockStore() *store.BlockStore {
	return n.blockStore
}

// ConsensusState returns the Node's ConsensusState.
func (n *nodeImpl) ConsensusState() *cs.State {
	return n.consensusState
}

// ConsensusReactor returns the Node's ConsensusReactor.
func (n *nodeImpl) ConsensusReactor() *cs.Reactor {
	return n.consensusReactor
}

// MempoolReactor returns the Node's mempool reactor.
func (n *nodeImpl) MempoolReactor() service.Service {
	return n.mempoolReactor
}

// Mempool returns the Node's mempool.
func (n *nodeImpl) Mempool() mempool.Mempool {
	return n.mempool
}

// PEXReactor returns the Node's PEXReactor. It returns nil if PEX is disabled.
func (n *nodeImpl) PEXReactor() *pex.Reactor {
	return n.pexReactor
}

// EvidencePool returns the Node's EvidencePool.
func (n *nodeImpl) EvidencePool() *evidence.Pool {
	return n.evidencePool
}

// EventBus returns the Node's EventBus.
func (n *nodeImpl) EventBus() *types.EventBus {
	return n.eventBus
}

// PrivValidator returns the Node's PrivValidator.
// XXX: for convenience only!
func (n *nodeImpl) PrivValidator() types.PrivValidator {
	return n.privValidator
}

// GenesisDoc returns the Node's GenesisDoc.
func (n *nodeImpl) GenesisDoc() *types.GenesisDoc {
	return n.genesisDoc
}

// ProxyApp returns the Node's AppConns, representing its connections to the ABCI application.
func (n *nodeImpl) ProxyApp() proxy.AppConns {
	return n.proxyApp
}

// Config returns the Node's config.
func (n *nodeImpl) Config() *cfg.Config {
	return n.config
}

// EventSinks returns the Node's event indexing sinks.
func (n *nodeImpl) EventSinks() []indexer.EventSink {
	return n.eventSinks
}

//------------------------------------------------------------------------------

func (n *nodeImpl) Listeners() []string {
	return []string{
		fmt.Sprintf("Listener(@%v)", n.config.P2P.ExternalAddress),
	}
}

func (n *nodeImpl) IsListening() bool {
	return n.isListening
}

// NodeInfo returns the Node's Info from the Switch.
func (n *nodeImpl) NodeInfo() p2p.NodeInfo {
	return n.nodeInfo
}

// startStateSync starts an asynchronous state sync process, then switches to fast sync mode.
func startStateSync(ssR *statesync.Reactor, bcR fastSyncReactor, conR *cs.Reactor,
	stateProvider statesync.StateProvider, config *cfg.StateSyncConfig, fastSync bool,
	stateStore sm.Store, blockStore *store.BlockStore, state sm.State) error {
	ssR.Logger.Info("Starting state sync")

	if stateProvider == nil {
		var err error
		ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
		defer cancel()
		stateProvider, err = statesync.NewLightClientStateProvider(
			ctx,
			state.ChainID, state.Version, state.InitialHeight,
			config.RPCServers, light.TrustOptions{
				Period: config.TrustPeriod,
				Height: config.TrustHeight,
				Hash:   config.TrustHashBytes(),
			}, ssR.Logger.With("module", "light"))
		if err != nil {
			return fmt.Errorf("failed to set up light client state provider: %w", err)
		}
	}

	go func() {
		state, commit, err := ssR.Sync(stateProvider, config.DiscoveryTime)
		if err != nil {
			ssR.Logger.Error("State sync failed", "err", err)
			return
		}
		err = stateStore.Bootstrap(state)
		if err != nil {
			ssR.Logger.Error("Failed to bootstrap node with new state", "err", err)
			return
		}
		err = blockStore.SaveSeenCommit(state.LastBlockHeight, commit)
		if err != nil {
			ssR.Logger.Error("Failed to store last seen commit", "err", err)
			return
		}

		if fastSync {
			// FIXME Very ugly to have these metrics bleed through here.
			conR.Metrics.StateSyncing.Set(0)
			conR.Metrics.FastSyncing.Set(1)
			err = bcR.SwitchToFastSync(state)
			if err != nil {
				ssR.Logger.Error("Failed to switch to fast sync", "err", err)
				return
			}
		} else {
			conR.SwitchToConsensus(state, true)
		}
	}()
	return nil
}

// genesisDocProvider returns a GenesisDoc.
// It allows the GenesisDoc to be pulled from sources other than the
// filesystem, for instance from a distributed key-value store cluster.
type genesisDocProvider func() (*types.GenesisDoc, error)

// defaultGenesisDocProviderFunc returns a GenesisDocProvider that loads
// the GenesisDoc from the config.GenesisFile() on the filesystem.
func defaultGenesisDocProviderFunc(config *cfg.Config) genesisDocProvider {
	return func() (*types.GenesisDoc, error) {
		return types.GenesisDocFromFile(config.GenesisFile())
	}
}
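
/*
Illustrative sketch, not part of this package: because genesisDocProvider is a
plain function type, a caller can swap the filesystem loader for any other
source, such as a fixed in-memory document in tests. The types doc and
docProvider and the constructor staticProvider are hypothetical stand-ins for
types.GenesisDoc and genesisDocProvider.

```go
package main

import (
	"errors"
	"fmt"
)

// doc is a stand-in for types.GenesisDoc.
type doc struct{ ChainID string }

// docProvider has the same shape as genesisDocProvider.
type docProvider func() (*doc, error)

// staticProvider returns a provider that always yields d, or an error when
// no document was supplied.
func staticProvider(d *doc) docProvider {
	return func() (*doc, error) {
		if d == nil {
			return nil, errors.New("no genesis doc configured")
		}
		return d, nil
	}
}

func main() {
	provider := staticProvider(&doc{ChainID: "test-chain"})
	d, err := provider()
	fmt.Println(d.ChainID, err) // test-chain <nil>
}
```
*/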

// metricsProvider returns consensus, p2p, mempool and state Metrics.
type metricsProvider func(chainID string) (*cs.Metrics, *p2p.Metrics, *mempool.Metrics, *sm.Metrics)

// defaultMetricsProvider returns Metrics built using the Prometheus client library
// if Prometheus is enabled. Otherwise, it returns no-op Metrics.
func defaultMetricsProvider(config *cfg.InstrumentationConfig) metricsProvider {
	return func(chainID string) (*cs.Metrics, *p2p.Metrics, *mempool.Metrics, *sm.Metrics) {
		if config.Prometheus {
			return cs.PrometheusMetrics(config.Namespace, "chain_id", chainID),
				p2p.PrometheusMetrics(config.Namespace, "chain_id", chainID),
				mempool.PrometheusMetrics(config.Namespace, "chain_id", chainID),
				sm.PrometheusMetrics(config.Namespace, "chain_id", chainID)
		}
		return cs.NopMetrics(), p2p.NopMetrics(), mempool.NopMetrics(), sm.NopMetrics()
	}
}

//------------------------------------------------------------------------------

var (
	genesisDocKey = []byte("genesisDoc")
)

// loadStateFromDBOrGenesisDocProvider attempts to load the state from the
// database, or creates one using the given genesisDocProvider. On success this also
// returns the genesis doc loaded through the given provider.
func loadStateFromDBOrGenesisDocProvider(
	stateDB dbm.DB,
	genesisDocProvider genesisDocProvider,
) (sm.State, *types.GenesisDoc, error) {
	// Get genesis doc
	genDoc, err := loadGenesisDoc(stateDB)
	if err != nil {
		genDoc, err = genesisDocProvider()
		if err != nil {
			return sm.State{}, nil, err
		}

		err = genDoc.ValidateAndComplete()
		if err != nil {
			return sm.State{}, nil, fmt.Errorf("error in genesis doc: %w", err)
		}
		// save genesis doc to prevent a certain class of user errors (e.g. when it
		// was changed, accidentally or not). Also good for audit trail.
		if err := saveGenesisDoc(stateDB, genDoc); err != nil {
			return sm.State{}, nil, err
		}
	}
	stateStore := sm.NewStore(stateDB)
	state, err := stateStore.LoadFromDBOrGenesisDoc(genDoc)
	if err != nil {
		return sm.State{}, nil, err
	}
	return state, genDoc, nil
}

// loadGenesisDoc loads the genesis doc from the database. It returns an error
// if no doc is stored, and panics if the read fails or the stored bytes
// cannot be unmarshaled.
func loadGenesisDoc(db dbm.DB) (*types.GenesisDoc, error) {
	b, err := db.Get(genesisDocKey)
	if err != nil {
		panic(err)
	}
	if len(b) == 0 {
		return nil, errors.New("genesis doc not found")
	}
	var genDoc *types.GenesisDoc
	err = tmjson.Unmarshal(b, &genDoc)
	if err != nil {
		panic(fmt.Sprintf("Failed to load genesis doc due to unmarshaling error: %v (bytes: %X)", err, b))
	}
	return genDoc, nil
}

// saveGenesisDoc marshals the given genesis doc and writes it to the database.
// It returns an error if marshaling or writing fails.
func saveGenesisDoc(db dbm.DB, genDoc *types.GenesisDoc) error {
	b, err := tmjson.Marshal(genDoc)
	if err != nil {
		return fmt.Errorf("failed to save genesis doc due to marshaling error: %w", err)
	}
	if err := db.SetSync(genesisDocKey, b); err != nil {
		return err
	}

	return nil
}
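
/*
Illustrative sketch, not part of this package: the genesis doc is looked up in
the database first, and only when it is missing is the provider consulted and
its result persisted. Later starts therefore ignore changes to the genesis
file. The sketch below reproduces that load-or-create flow with a map in place
of dbm.DB and encoding/json in place of tmjson. All names are hypothetical, and
unlike loadGenesisDoc it returns an error on corrupt data instead of panicking.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// doc is a stand-in for types.GenesisDoc.
type doc struct {
	ChainID string `json:"chain_id"`
}

// memStore is a stand-in for the state database.
type memStore map[string][]byte

const docKey = "genesisDoc"

// loadOrCreate returns the stored doc if present. Otherwise it calls the
// provider once and persists the result for subsequent calls.
func loadOrCreate(store memStore, provider func() (*doc, error)) (*doc, error) {
	if b := store[docKey]; len(b) > 0 {
		var d doc
		if err := json.Unmarshal(b, &d); err != nil {
			return nil, fmt.Errorf("stored doc is corrupt: %w", err)
		}
		return &d, nil
	}
	d, err := provider()
	if err != nil {
		return nil, err
	}
	if d == nil {
		return nil, errors.New("provider returned no doc")
	}
	b, err := json.Marshal(d)
	if err != nil {
		return nil, fmt.Errorf("failed to marshal doc: %w", err)
	}
	store[docKey] = b
	return d, nil
}

func main() {
	store := memStore{}
	calls := 0
	provider := func() (*doc, error) {
		calls++
		return &doc{ChainID: "test-chain"}, nil
	}
	first, _ := loadOrCreate(store, provider)
	second, _ := loadOrCreate(store, provider)
	fmt.Println(first.ChainID, second.ChainID, calls) // test-chain test-chain 1
}
```
*/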

func createAndStartPrivValidatorSocketClient(
	listenAddr,
	chainID string,
	logger log.Logger,
) (types.PrivValidator, error) {
	pve, err := privval.NewSignerListener(listenAddr, logger)
	if err != nil {
		return nil, fmt.Errorf("failed to start private validator: %w", err)
	}

	pvsc, err := privval.NewSignerClient(pve, chainID)
	if err != nil {
		return nil, fmt.Errorf("failed to start private validator: %w", err)
	}

	// try to get a pubkey from the private validator up front
	_, err = pvsc.GetPubKey(context.TODO())
	if err != nil {
		return nil, fmt.Errorf("can't get pubkey: %w", err)
	}

	const (
		retries = 50 // 50 * 100ms = 5s total
		timeout = 100 * time.Millisecond
	)
	pvscWithRetries := privval.NewRetrySignerClient(pvsc, retries, timeout)

	return pvscWithRetries, nil
}

func createAndStartPrivValidatorGRPCClient(
	config *cfg.Config,
	chainID string,
	logger log.Logger,
) (types.PrivValidator, error) {
	pvsc, err := tmgrpc.DialRemoteSigner(config, chainID, logger)
	if err != nil {
		return nil, fmt.Errorf("failed to start private validator: %w", err)
	}

	// try to get a pubkey from the private validator up front
	_, err = pvsc.GetPubKey(context.TODO())
	if err != nil {
		return nil, fmt.Errorf("can't get pubkey: %w", err)
	}

	return pvsc, nil
}

func getRouterConfig(conf *cfg.Config, proxyApp proxy.AppConns) p2p.RouterOptions {
	opts := p2p.RouterOptions{
		QueueType: conf.P2P.QueueType,
	}

	if conf.P2P.MaxNumInboundPeers > 0 {
		opts.MaxIncomingConnectionAttempts = conf.P2P.MaxIncomingConnectionAttempts
	}

	if conf.FilterPeers && proxyApp != nil {
		opts.FilterPeerByID = func(ctx context.Context, id p2p.NodeID) error {
			// Pass the caller's ctx through so cancellation reaches the ABCI query,
			// matching FilterPeerByIP below.
			res, err := proxyApp.Query().QuerySync(ctx, abci.RequestQuery{
				Path: fmt.Sprintf("/p2p/filter/id/%s", id),
			})
			if err != nil {
				return err
			}
			if res.IsErr() {
				return fmt.Errorf("error querying abci app: %v", res)
			}

			return nil
		}

		opts.FilterPeerByIP = func(ctx context.Context, ip net.IP, port uint16) error {
			res, err := proxyApp.Query().QuerySync(ctx, abci.RequestQuery{
				Path: fmt.Sprintf("/p2p/filter/addr/%s", net.JoinHostPort(ip.String(), strconv.Itoa(int(port)))),
			})
			if err != nil {
				return err
			}
			if res.IsErr() {
				return fmt.Errorf("error querying abci app: %v", res)
			}

			return nil
		}
	}

	return opts
}

// FIXME: Temporary helper function, shims should be removed.
func makeChannelsFromShims(
	router *p2p.Router,
	chShims map[p2p.ChannelID]*p2p.ChannelDescriptorShim,
) map[p2p.ChannelID]*p2p.Channel {

	channels := map[p2p.ChannelID]*p2p.Channel{}
	for chID, chShim := range chShims {
		ch, err := router.OpenChannel(*chShim.Descriptor, chShim.MsgType, chShim.Descriptor.RecvBufferCapacity)
		if err != nil {
			panic(fmt.Sprintf("failed to open channel %v: %v", chID, err))
		}

		channels[chID] = ch
	}

	return channels
}

func getChannelsFromShim(reactorShim *p2p.ReactorShim) map[p2p.ChannelID]*p2p.Channel {
	channels := map[p2p.ChannelID]*p2p.Channel{}
	for chID := range reactorShim.Channels {
		channels[chID] = reactorShim.GetChannel(chID)
	}

	return channels
}