privval: refactor Remote signers (#3370)

This PR is related to #3107 and a continuation of #3351

It is important to emphasise that in the privval original design, client/server and listening/dialing roles are inverted and do not follow a conventional interaction.

Given two hosts A and B:

* Host A is listener/client
* Host B is dialer/server (contains the secret key)

When A requires a signature, it needs to wait for B to dial in before it can issue a request. A only accepts a single connection and any failure leads to dropping the connection and waiting for B to reconnect.

The original rationale behind this design was based on security.

* Host B only allows outbound connections to a list of whitelisted hosts.
* It is not possible to reach B unless B dials in. There are no listening/open ports in B.

This PR results in the following changes:

* Refactors ping/heartbeat to avoid previously existing race conditions.
* Separates transport (dialer/listener) from signing (client/server) concerns to simplify workflow.
* Unifies and abstracts away the differences between unix and tcp sockets.
* A single signer endpoint implementation unifies connection handling code (read/write/close/connection obj)
* The signer request handler (server side) is customizable to increase testability.
* Updates and extends unit tests

A high level overview of the classes is as follows:

Transport (endpoints): The following classes take care of establishing a connection

* SignerDialerEndpoint
* SignerListeningEndpoint
* SignerEndpoint groups common functionality (read/write/timeouts/etc.)

Signing (client/server): The following classes take care of exchanging request/responses

* SignerClient
* SignerServer

This PR also closes #3601

Commits:

* refactoring - work in progress
* reworking unit tests
* Encapsulating and fixing unit tests
* Improve tests
* Clean up
* Fix/improve unit tests
* clean up tests
* Improving service endpoint
* fixing unit test
* fix linter issues
* avoid invalid cache values (improve later?)
* complete implementation
* wip
* improved connection loop
* Improve reconnections + fixing unit tests
* addressing comments
* small formatting changes
* clean up
* Update node/node.go Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_client.go Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_client_test.go Co-Authored-By: jleni <juan.leni@zondax.ch>
* check during initialization
* dropping connecting when writing fails
* removing break
* use t.log instead
* unifying and using cmn.GetFreePort()
* review fixes
* reordering and unifying drop connection
* closing instead of signalling
* refactored service loop
* removed superfluous brackets
* GetPubKey can return errors
* Revert "GetPubKey can return errors" This reverts commit 68c06f19b4650389d7e5ab1659b318889028202c.
* adding entry to changelog
* Update CHANGELOG_PENDING.md Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_client.go Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_listener_endpoint_test.go Co-Authored-By: jleni <juan.leni@zondax.ch>
* updating node.go
* review fixes
* fixes linter
* fixing unit test
* small fixes in comments
* addressing review comments
* addressing review comments 2
* reverting suggestion
* Update privval/signer_client_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update privval/signer_client_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update privval/signer_listener_endpoint_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* do not expose brokenSignerDialerEndpoint
* clean up logging
* unifying methods shorten test time signer also drops
* reenabling pings
* improving testability + unit test
* fixing go fmt + unit test
* remove unused code
* Addressing review comments
* simplifying connection workflow
* fix linter/go import issue
* using base service quit
* updating comment
* Simplifying design + adjusting names
* fixing linter issues
* refactoring test harness + fixes
* Addressing review comments
* cleaning up
* adding additional error check
5 years ago
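
The inverted roles described in the privval commit above can be illustrated with a minimal Go sketch. It is not the privval implementation: `requestSignature`, `serveSigner`, the newline-delimited wire format, and the `signed(...)` reply are all hypothetical stand-ins. It only shows that host A listens and then acts as the client, while host B dials out and then acts as the server.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
)

// serveSigner plays host B (hypothetical): it dials out to the listener and
// then answers every request that arrives on that outbound connection.
// The reply is a placeholder, not a real signature.
func serveSigner(addr string) error {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return err
	}
	defer conn.Close()

	r := bufio.NewReader(conn)
	for {
		line, err := r.ReadString('\n')
		if err != nil {
			return nil // the listener closed the connection
		}
		msg := strings.TrimSpace(line)
		if _, err := fmt.Fprintf(conn, "signed(%s)\n", msg); err != nil {
			return err
		}
	}
}

// requestSignature plays host A (hypothetical): it listens, waits for B to
// dial in, and only then issues its request over the accepted connection.
func requestSignature(msg string) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()

	// In a real deployment B runs on another host; here it is a goroutine.
	go serveSigner(ln.Addr().String())

	conn, err := ln.Accept() // A blocks here until B connects
	if err != nil {
		return "", err
	}
	defer conn.Close()

	if _, err := fmt.Fprintf(conn, "%s\n", msg); err != nil {
		return "", err
	}
	reply, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(reply), nil
}

func main() {
	sig, err := requestSignature("vote")
	if err != nil {
		panic(err)
	}
	fmt.Println(sig)
}
```

Because the key holder only ever dials out, it needs no open port, which matches the security rationale given in the commit message.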
limit number of /subscribe clients and queries per client (#3269)

* limit number of /subscribe clients and queries per client

  Add the following config variables (under [rpc] section):

  * max_subscription_clients
  * max_subscriptions_per_client
  * timeout_broadcast_tx_commit

  Fixes #2826

  new HTTPClient interface for subscriptions

  finalize HTTPClient events interface

  remove EventSubscriber

  fix data race

```
WARNING: DATA RACE
Read at 0x00c000a36060 by goroutine 129:
  github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe.func1()
      /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:168 +0x1f0

Previous write at 0x00c000a36060 by goroutine 132:
  github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe()
      /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:191 +0x4e0
  github.com/tendermint/tendermint/rpc/client.WaitForOneEvent()
      /go/src/github.com/tendermint/tendermint/rpc/client/helpers.go:64 +0x178
  github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync.func1()
      /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:139 +0x298
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:827 +0x162

Goroutine 129 (running) created at:
  github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe()
      /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:164 +0x4b7
  github.com/tendermint/tendermint/rpc/client.WaitForOneEvent()
      /go/src/github.com/tendermint/tendermint/rpc/client/helpers.go:64 +0x178
  github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync.func1()
      /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:139 +0x298
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:827 +0x162

Goroutine 132 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:878 +0x659
  github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync()
      /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:119 +0x186
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:827 +0x162
==================
```

  lite client works (tested manually)

  godoc comments

  httpclient: do not close the out channel

  use TimeoutBroadcastTxCommit

  no timeout for unsubscribe but 1s Local (5s HTTP) timeout for resubscribe

  format code

  change Subscribe#out cap to 1 and replace config vars with RPCConfig

  TimeoutBroadcastTxCommit can't be greater than rpcserver.WriteTimeout

  rpc: Context as first parameter to all functions

  reformat code

  fixes after my own review

  fixes after Ethan's review

  add test stubs

  fix config.toml

* fixes after manual testing

  - rpc: do not recommend to use BroadcastTxCommit because it's slow and wastes Tendermint resources (pubsub)
  - rpc: better error in Subscribe and BroadcastTxCommit
  - HTTPClient: do not resubscribe if err = ErrAlreadySubscribed

* fixes after Ismail's review
* Update rpc/grpc/grpc_test.go Co-Authored-By: melekes <anton.kalyaev@gmail.com>
6 years ago
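
As a rough illustration of the two limits introduced by the commit above, the following Go sketch enforces a cap on the number of subscribing clients and a cap on queries per client. Only the config names `max_subscription_clients` and `max_subscriptions_per_client` come from the commit message. The `subscriptionLimiter` type, its methods, and the error values are hypothetical and are not Tendermint's actual code. The sketch is not safe for concurrent use and treats a repeated query from the same client as a no-op, which is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical errors named after the config variables they enforce.
var (
	errTooManyClients = errors.New("max_subscription_clients reached")
	errTooManySubs    = errors.New("max_subscriptions_per_client reached")
)

// subscriptionLimiter (hypothetical) tracks which queries each client holds.
type subscriptionLimiter struct {
	maxClients   int
	maxPerClient int
	subs         map[string]map[string]struct{} // client -> set of queries
}

func newSubscriptionLimiter(maxClients, maxPerClient int) *subscriptionLimiter {
	return &subscriptionLimiter{
		maxClients:   maxClients,
		maxPerClient: maxPerClient,
		subs:         make(map[string]map[string]struct{}),
	}
}

// subscribe records the query for the client, or returns an error when
// either limit would be exceeded.
func (l *subscriptionLimiter) subscribe(client, query string) error {
	queries, known := l.subs[client]
	if !known {
		if len(l.subs) >= l.maxClients {
			return errTooManyClients
		}
		queries = make(map[string]struct{})
	}
	if _, dup := queries[query]; dup {
		return nil // assumption: repeating a query is harmless here
	}
	if len(queries) >= l.maxPerClient {
		return errTooManySubs
	}
	queries[query] = struct{}{}
	l.subs[client] = queries
	return nil
}

func main() {
	l := newSubscriptionLimiter(1, 2)
	fmt.Println(l.subscribe("client-a", "tm.event='NewBlock'"))
	fmt.Println(l.subscribe("client-a", "tm.event='Tx'"))
	fmt.Println(l.subscribe("client-a", "tm.event='Vote'"))
	fmt.Println(l.subscribe("client-b", "tm.event='NewBlock'"))
}
```

Checking the client cap before the per-client cap means a new client is rejected outright once the server is full, without allocating any state for it.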
blockchain: Reorg reactor (#3561)

* go routines in blockchain reactor
* Added reference to the go routine diagram
* Initial commit
* cleanup
* Undo testing_logger change, committed by mistake
* Fix the test loggers
* pulled some fsm code into pool.go
* added pool tests
* changes to the design

  added block requests under peer

  moved the request trigger in the reactor poolRoutine, triggered now by a ticker

  in general moved everything required for making block requests smarter in the poolRoutine

  added a simple map of heights to keep track of what will need to be requested next

  added a few more tests

* send errors to FSM in a different channel than blocks

  send errors (RemovePeer) from switch on a different channel than the one receiving blocks

  renamed channels

  added more pool tests

* more pool tests
* lint errors
* more tests
* more tests
* switch fast sync to new implementation
* fixed data race in tests
* cleanup
* finished fsm tests
* address golangci comments :)
* address golangci comments :)
* Added timeout on next block needed to advance
* updating docs and cleanup
* fix issue in test from previous cleanup
* cleanup
* Added termination scenarios, tests and more cleanup
* small fixes to adr, comments and cleanup
* Fix bug in sendRequest()

  If we tried to send a request to a peer not present in the switch, a missing continue statement caused the request to be blackholed in a peer that was removed and never retried.

  While this bug was manifesting, the reactor kept asking for other blocks that would be stored and never consumed. Added the number of unconsumed blocks in the math for requesting blocks ahead of current processing height so eventually there will be no more blocks requested until the already received ones are consumed.

* remove bpPeer's didTimeout field
* Use distinct err codes for peer timeout and FSM timeouts
* Don't allow peers to update with lower height
* review comments from Ethan and Zarko
* some cleanup, renaming, comments
* Move block execution in separate goroutine
* Remove pool's numPending
* review comments
* fix lint, remove old blockchain reactor and duplicates in fsm tests
* small reorg around peer after review comments
* add the reactor spec
* verify block only once
* review comments
* change to int for max number of pending requests
* cleanup and godoc
* Add configuration flag fast sync version
* golangci fixes
* fix config template
* move both reactor versions under blockchain
* cleanup, golint, renaming stuff
* updated documentation, fixed more golint warnings
* integrate with behavior package
* sync with master
* gofmt
* add changelog_pending entry
* move to improvments
* suggestion to changelog entry
6 years ago
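
The `sendRequest()` fix described in the commit above hinges on a single `continue`. The following Go sketch shows the general shape of that kind of bug under stated assumptions. `assignRequest`, its parameters, and the map-based bookkeeping are hypothetical and are not taken from the reactor code. The point is only that a candidate peer missing from the connected set has to be skipped before the request is recorded against it.

```go
package main

import "fmt"

// assignRequest (hypothetical) records the block request for height against
// the first candidate peer that is still connected. It reports whether a
// peer was found, so the caller can retry later when none was.
func assignRequest(height int64, candidates []string, connected map[string]bool, pending map[int64]string) bool {
	for _, id := range candidates {
		if !connected[id] {
			// Without this continue, execution would fall through and the
			// request would be recorded against a removed peer, where it
			// is never answered and never retried.
			continue
		}
		pending[height] = id
		return true
	}
	return false
}

func main() {
	connected := map[string]bool{"peer-2": true} // peer-1 was removed
	pending := make(map[int64]string)

	ok := assignRequest(10, []string{"peer-1", "peer-2"}, connected, pending)
	fmt.Println(ok, pending[10])
}
```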
node: refactor node.NewNode (#3456)

The node.NewNode method is pretty complex at the moment, and in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced.

This PR aims to eventually make this easier/simpler.

See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here.

## Commits:

* [WIP] Refactor node.NewNode to simplify

  The `node.NewNode` method is pretty complex at the moment, and in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced.

  This PR aims to eventually make this easier/simpler.

* Refactor state loading and genesis doc provider into state package
* Refactor for clarity of return parameters
* Fix incorrect capitalization of error messages
* Simplify extracted functions' names
* Document optionally-prefixed functions
* Refactor optionallyFastSync for clarity of separation of concerns
* Restructure function for early return
* Restructure function for early return
* Remove dependence on deprecated panic functions
* refactor code a bit more plus, expose PEXReactor on node
* align logger names
* add a changelog entry
* align logger names 2
* add a note about PEXReactor returning nil
6 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
6 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
6 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
6 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
6 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
6 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
6 years ago
privval: refactor Remote signers (#3370) This PR is related to #3107 and a continuation of #3351 It is important to emphasise that in the privval original design, client/server and listening/dialing roles are inverted and do not follow a conventional interaction. Given two hosts A and B: Host A is listener/client Host B is dialer/server (contains the secret key) When A requires a signature, it needs to wait for B to dial in before it can issue a request. A only accepts a single connection and any failure leads to dropping the connection and waiting for B to reconnect. The original rationale behind this design was based on security. Host B only allows outbound connections to a list of whitelisted hosts. It is not possible to reach B unless B dials in. There are no listening/open ports in B. This PR results in the following changes: Refactors ping/heartbeat to avoid previously existing race conditions. Separates transport (dialer/listener) from signing (client/server) concerns to simplify workflow. Unifies and abstracts away the differences between unix and tcp sockets. A single signer endpoint implementation unifies connection handling code (read/write/close/connection obj) The signer request handler (server side) is customizable to increase testability. Updates and extends unit tests A high level overview of the classes is as follows: Transport (endpoints): The following classes take care of establishing a connection SignerDialerEndpoint SignerListeningEndpoint SignerEndpoint groups common functionality (read/write/timeouts/etc.) Signing (client/server): The following classes take care of exchanging request/responses SignerClient SignerServer This PR also closes #3601 Commits: * refactoring - work in progress * reworking unit tests * Encapsulating and fixing unit tests * Improve tests * Clean up * Fix/improve unit tests * clean up tests * Improving service endpoint * fixing unit test * fix linter issues * avoid invalid cache values (improve later?) 
* complete implementation * wip * improved connection loop * Improve reconnections + fixing unit tests * addressing comments * small formatting changes * clean up * Update node/node.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_client.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_client_test.go Co-Authored-By: jleni <juan.leni@zondax.ch> * check during initialization * dropping connecting when writing fails * removing break * use t.log instead * unifying and using cmn.GetFreePort() * review fixes * reordering and unifying drop connection * closing instead of signalling * refactored service loop * removed superfluous brackets * GetPubKey can return errors * Revert "GetPubKey can return errors" This reverts commit 68c06f19b4650389d7e5ab1659b318889028202c. * adding entry to changelog * Update CHANGELOG_PENDING.md Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_client.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_dialer_endpoint.go Co-Authored-By: jleni <juan.leni@zondax.ch> * Update privval/signer_listener_endpoint_test.go Co-Authored-By: jleni <juan.leni@zondax.ch> * updating node.go * review fixes * fixes linter * fixing unit test * small fixes in comments * addressing review comments * addressing review comments 2 * reverting suggestion * Update privval/signer_client_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com> * Update privval/signer_client_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com> * Update privval/signer_listener_endpoint_test.go Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com> * do not expose brokenSignerDialerEndpoint * clean up logging * unifying methods shorten 
test time signer also drops * reenabling pings * improving testability + unit test * fixing go fmt + unit test * remove unused code * Addressing review comments * simplifying connection workflow * fix linter/go import issue * using base service quit * updating comment * Simplifying design + adjusting names * fixing linter issues * refactoring test harness + fixes * Addressing review comments * cleaning up * adding additional error check
5 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
6 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
6 years ago
node: refactor node.NewNode (#3456) The node.NewNode method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the node.TestCreateProposalBlock test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. See also this gist https://gist.github.com/thanethomson/56e1640d057a26186e38ad678a1d114c for some background work done when starting to refactor here. ## Commits: * [WIP] Refactor node.NewNode to simplify The `node.NewNode` method is pretty complex at the moment, an in order to address issues like #3156, we need to simplify the interface for partial node instantiation. In some places, we don't need to build up a full node (like in the `node.TestCreateProposalBlock` test), but the complexity of such partial instantiation needs to be reduced. This PR aims to eventually make this easier/simpler. * Refactor state loading and genesis doc provider into state package * Refactor for clarity of return parameters * Fix incorrect capitalization of error messages * Simplify extracted functions' names * Document optionally-prefixed functions * Refactor optionallyFastSync for clarity of separation of concerns * Restructure function for early return * Restructure function for early return * Remove dependence on deprecated panic functions * refactor code a bit more plus, expose PEXReactor on node * align logger names * add a changelog entry * align logger names 2 * add a note about PEXReactor returning nil
6 years ago
blockchain: Reorg reactor (#3561)

* go routines in blockchain reactor
* Added reference to the go routine diagram
* Initial commit
* cleanup
* Undo testing_logger change, committed by mistake
* Fix the test loggers
* pulled some fsm code into pool.go
* added pool tests
* changes to the design
  added block requests under peer
  moved the request trigger in the reactor poolRoutine, triggered now by a ticker
  in general moved everything required for making block requests smarter in the poolRoutine
  added a simple map of heights to keep track of what will need to be requested next
  added a few more tests
* send errors to FSM in a different channel than blocks
  send errors (RemovePeer) from switch on a different channel than the one receiving blocks
  renamed channels
  added more pool tests
* more pool tests
* lint errors
* more tests
* more tests
* switch fast sync to new implementation
* fixed data race in tests
* cleanup
* finished fsm tests
* address golangci comments :)
* address golangci comments :)
* Added timeout on next block needed to advance
* updating docs and cleanup
* fix issue in test from previous cleanup
* cleanup
* Added termination scenarios, tests and more cleanup
* small fixes to adr, comments and cleanup
* Fix bug in sendRequest()

  If we tried to send a request to a peer not present in the switch, a missing continue statement caused the request to be blackholed in a peer that was removed and never retried.

  While this bug was manifesting, the reactor kept asking for other blocks that would be stored and never consumed. Added the number of unconsumed blocks in the math for requesting blocks ahead of current processing height so eventually there will be no more blocks requested until the already received ones are consumed.
* remove bpPeer's didTimeout field
* Use distinct err codes for peer timeout and FSM timeouts
* Don't allow peers to update with lower height
* review comments from Ethan and Zarko
* some cleanup, renaming, comments
* Move block execution in separate goroutine
* Remove pool's numPending
* review comments
* fix lint, remove old blockchain reactor and duplicates in fsm tests
* small reorg around peer after review comments
* add the reactor spec
* verify block only once
* review comments
* change to int for max number of pending requests
* cleanup and godoc
* Add configuration flag fast sync version
* golangci fixes
* fix config template
* move both reactor versions under blockchain
* cleanup, golint, renaming stuff
* updated documentation, fixed more golint warnings
* integrate with behavior package
* sync with master
* gofmt
* add changelog_pending entry
* move to improvements
* suggestion to changelog entry
6 years ago
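The `sendRequest()` fix described in the commit above can be illustrated with a small sketch. This is not the reactor's code: `sendRequests`, `blocksToRequest` and their parameters are hypothetical names chosen for illustration, and the logic only assumes what the commit message states (skip peers that are gone so the height can be retried, and count unconsumed blocks when deciding how far ahead to request).

```go
package main

import (
	"fmt"
	"sort"
)

// blocksToRequest returns how many new block requests may be issued.
// All three parameters are hypothetical: maxPending is the request window,
// pending counts in-flight requests, and unconsumed counts blocks that were
// received but not yet processed. Counting unconsumed blocks means the
// window closes until the received blocks are consumed.
func blocksToRequest(maxPending, pending, unconsumed int) int {
	n := maxPending - pending - unconsumed
	if n < 0 {
		return 0
	}
	return n
}

// sendRequests walks the height-to-peer assignments. Heights assigned to a
// peer that is no longer connected are collected for retry instead of being
// recorded as sent.
func sendRequests(assigned map[int64]string, connected map[string]bool) (sent, retry []int64) {
	for height, peer := range assigned {
		if !connected[peer] {
			retry = append(retry, height)
			// Without this continue the height would fall through and be
			// treated as sent to a peer that has been removed.
			continue
		}
		sent = append(sent, height)
	}
	// Map iteration order is random, so sort for stable results.
	sort.Slice(sent, func(i, j int) bool { return sent[i] < sent[j] })
	sort.Slice(retry, func(i, j int) bool { return retry[i] < retry[j] })
	return sent, retry
}

func main() {
	assigned := map[int64]string{1: "peerA", 2: "peerB", 3: "peerA"}
	connected := map[string]bool{"peerA": true}
	sent, retry := sendRequests(assigned, connected)
	fmt.Println("sent:", sent, "retry:", retry)
	fmt.Println("may request:", blocksToRequest(10, len(sent), 4))
}
```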
p2p: implement new Transport interface (#5791)

This implements a new `Transport` interface and related types for the P2P refactor in #5670. Previously, `conn.MConnection` was very tightly coupled to the `Peer` implementation -- in order to allow alternative non-multiplexed transports (e.g. QUIC), MConnection has now been moved below the `Transport` interface, as `MConnTransport`, and decoupled from the peer. Since the `p2p` package is not covered by our Go API stability, this is not considered a breaking change, and not listed in the changelog.

The initial approach was to implement the new interface in its final form (which also involved possible protocol changes, see https://github.com/tendermint/spec/pull/227). However, it turned out that this would require a large amount of changes to existing P2P code because of the previous tight coupling between `Peer` and `MConnection` and the reliance on subtleties in the MConnection behavior. Instead, I have broadened the `Transport` interface to expose much of the existing MConnection interface, preserved much of the existing MConnection logic and behavior in the transport implementation, and tried to make as few changes to the rest of the P2P stack as possible. We will instead reduce this interface gradually as we refactor other parts of the P2P stack.

The low-level transport code and protocol (e.g. MConnection, SecretConnection and so on) has not been significantly changed, and refactoring this is not a priority until we come up with a plan for QUIC adoption, as we may end up discarding the MConnection code entirely.

There are no tests of the new `MConnTransport`, as this code is likely to evolve as we proceed with the P2P refactor, but tests should be added before a final release. The E2E tests are sufficient for basic validation in the meanwhile.
4 years ago
limit number of /subscribe clients and queries per client (#3269)

* limit number of /subscribe clients and queries per client

  Add the following config variables (under [rpc] section):
  * max_subscription_clients
  * max_subscriptions_per_client
  * timeout_broadcast_tx_commit

  Fixes #2826

  new HTTPClient interface for subscriptions

  finalize HTTPClient events interface

  remove EventSubscriber

  fix data race

  ```
  WARNING: DATA RACE
  Read at 0x00c000a36060 by goroutine 129:
    github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe.func1()
        /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:168 +0x1f0

  Previous write at 0x00c000a36060 by goroutine 132:
    github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe()
        /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:191 +0x4e0
    github.com/tendermint/tendermint/rpc/client.WaitForOneEvent()
        /go/src/github.com/tendermint/tendermint/rpc/client/helpers.go:64 +0x178
    github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync.func1()
        /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:139 +0x298
    testing.tRunner()
        /usr/local/go/src/testing/testing.go:827 +0x162

  Goroutine 129 (running) created at:
    github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe()
        /go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:164 +0x4b7
    github.com/tendermint/tendermint/rpc/client.WaitForOneEvent()
        /go/src/github.com/tendermint/tendermint/rpc/client/helpers.go:64 +0x178
    github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync.func1()
        /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:139 +0x298
    testing.tRunner()
        /usr/local/go/src/testing/testing.go:827 +0x162

  Goroutine 132 (running) created at:
    testing.(*T).Run()
        /usr/local/go/src/testing/testing.go:878 +0x659
    github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync()
        /go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:119 +0x186
    testing.tRunner()
        /usr/local/go/src/testing/testing.go:827 +0x162
  ==================
  ```

  lite client works (tested manually)

  godoc comments

  httpclient: do not close the out channel

  use TimeoutBroadcastTxCommit

  no timeout for unsubscribe but 1s Local (5s HTTP) timeout for resubscribe

  format code

  change Subscribe#out cap to 1 and replace config vars with RPCConfig

  TimeoutBroadcastTxCommit can't be greater than rpcserver.WriteTimeout

  rpc: Context as first parameter to all functions

  reformat code

  fixes after my own review

  fixes after Ethan's review

  add test stubs

  fix config.toml
* fixes after manual testing
  - rpc: do not recommend to use BroadcastTxCommit because it's slow and wastes Tendermint resources (pubsub)
  - rpc: better error in Subscribe and BroadcastTxCommit
  - HTTPClient: do not resubscribe if err = ErrAlreadySubscribed
* fixes after Ismail's review
* Update rpc/grpc/grpc_test.go

  Co-Authored-By: melekes <anton.kalyaev@gmail.com>
6 years ago
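The two limits named in the commit above (`max_subscription_clients` and `max_subscriptions_per_client`) can be pictured with a minimal sketch. This is an assumption-laden illustration, not the RPC server's implementation: `subscriptionLimiter`, its fields, and the error values are hypothetical names, and only the idea of capping clients and per-client queries comes from the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Hypothetical error values; the real server's errors may differ.
var (
	errMaxClients   = errors.New("max_subscription_clients reached")
	errMaxPerClient = errors.New("max_subscriptions_per_client reached")
)

// subscriptionLimiter is a hypothetical helper that tracks, per client ID,
// the set of queries that client is subscribed to.
type subscriptionLimiter struct {
	mtx          sync.Mutex
	maxClients   int
	maxPerClient int
	queries      map[string]map[string]struct{}
}

func newSubscriptionLimiter(maxClients, maxPerClient int) *subscriptionLimiter {
	return &subscriptionLimiter{
		maxClients:   maxClients,
		maxPerClient: maxPerClient,
		queries:      make(map[string]map[string]struct{}),
	}
}

// subscribe records a subscription or returns an error when either limit
// would be exceeded. Re-subscribing to the same query is a no-op here.
func (l *subscriptionLimiter) subscribe(clientID, query string) error {
	l.mtx.Lock()
	defer l.mtx.Unlock()

	qs, known := l.queries[clientID]
	if !known && len(l.queries) >= l.maxClients {
		return errMaxClients
	}
	if _, dup := qs[query]; dup {
		return nil
	}
	if len(qs) >= l.maxPerClient {
		return errMaxPerClient
	}
	if !known {
		qs = make(map[string]struct{})
		l.queries[clientID] = qs
	}
	qs[query] = struct{}{}
	return nil
}

func main() {
	l := newSubscriptionLimiter(1, 2)
	fmt.Println(l.subscribe("client-1", "tm.event='NewBlock'"))
	fmt.Println(l.subscribe("client-1", "tm.event='Tx'"))
	fmt.Println(l.subscribe("client-1", "tm.event='Vote'"))
	fmt.Println(l.subscribe("client-2", "tm.event='Tx'"))
}
```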
mempool: move interface into mempool package (#3524)

## Description

Refs #2659

Breaking changes in the mempool package:

[mempool] #2659 Mempool now an interface
old Mempool renamed to CListMempool
NewMempool renamed to NewCListMempool
Option renamed to CListOption
MempoolReactor renamed to Reactor
NewMempoolReactor renamed to NewReactor
unexpose TxID method
TxInfo.PeerID renamed to SenderID
unexpose MempoolReactor.Mempool

Breaking changes in the state package:

[state] #2659 Mempool interface moved to mempool package
MockMempool moved to top-level mock package and renamed to Mempool

Non Breaking changes in the node package:

[node] #2659 Add Mempool method, which allows you to access mempool

## Commits

* move Mempool interface into mempool package

  Refs #2659

  Breaking changes in the mempool package:
  - Mempool now an interface
  - old Mempool renamed to CListMempool

  Breaking changes to state package:
  - MockMempool moved to mempool/mock package and renamed to Mempool
  - Mempool interface moved to mempool package
* assert CListMempool impl Mempool
* gofmt code
* rename MempoolReactor to Reactor
  - combine everything into one interface
  - rename TxInfo.PeerID to TxInfo.SenderID
  - unexpose MempoolReactor.Mempool
* move mempool mock into top-level mock package
* add a fixme

  TxsFront should not be a part of the Mempool interface because it leaks implementation details. Instead, we need to come up with general interface for querying the mempool so the MempoolReactor can fetch and broadcast txs to peers.
* change node#Mempool to return interface
* save commit = new reactor arch
* Revert "save commit = new reactor arch"

  This reverts commit 1bfceacd9d65a720574683a7f22771e69af9af4d.
* require CListMempool in mempool.Reactor
* add two changelog entries
* fixes after my own review
* quote interfaces, structs and functions
* fixes after Ismail's review
* make node's mempool an interface
* make InitWAL/CloseWAL methods a part of Mempool interface
* fix merge conflicts
* make node's mempool an interface
6 years ago
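The "assert CListMempool impl Mempool" step in the commit above refers to a common Go idiom: a blank-identifier declaration that fails to compile if the concrete type stops satisfying the interface. The sketch below shows the idiom with a deliberately tiny, hypothetical `Mempool` interface (the two methods and `pendingCount` are assumptions made for illustration and do not reproduce the real interface).

```go
package main

import (
	"errors"
	"fmt"
)

// Mempool is a hypothetical, much-reduced stand-in for the interface the
// commit moves into the mempool package.
type Mempool interface {
	CheckTx(tx []byte) error
	Size() int
}

// CListMempool is a toy concrete implementation (a plain slice here, not a
// concurrent linked list).
type CListMempool struct {
	txs [][]byte
}

// Compile-time assertion: the build breaks if *CListMempool ever stops
// implementing Mempool.
var _ Mempool = (*CListMempool)(nil)

func NewCListMempool() *CListMempool { return &CListMempool{} }

// CheckTx accepts any non-empty transaction in this sketch.
func (m *CListMempool) CheckTx(tx []byte) error {
	if len(tx) == 0 {
		return errors.New("empty tx")
	}
	m.txs = append(m.txs, tx)
	return nil
}

func (m *CListMempool) Size() int { return len(m.txs) }

// pendingCount depends only on the interface, so callers such as a node can
// be handed any implementation, including a mock.
func pendingCount(mp Mempool) int { return mp.Size() }

func main() {
	var mp Mempool = NewCListMempool()
	if err := mp.CheckTx([]byte("tx-1")); err != nil {
		fmt.Println("rejected:", err)
	}
	fmt.Println("pending:", pendingCount(mp))
}
```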
privval: refactor Remote signers (#3370)

This PR is related to #3107 and a continuation of #3351

It is important to emphasise that in the privval original design, client/server and listening/dialing roles are inverted and do not follow a conventional interaction.

Given two hosts A and B:

Host A is listener/client
Host B is dialer/server (contains the secret key)

When A requires a signature, it needs to wait for B to dial in before it can issue a request. A only accepts a single connection and any failure leads to dropping the connection and waiting for B to reconnect.

The original rationale behind this design was based on security.

Host B only allows outbound connections to a list of whitelisted hosts.
It is not possible to reach B unless B dials in. There are no listening/open ports in B.

This PR results in the following changes:

Refactors ping/heartbeat to avoid previously existing race conditions.
Separates transport (dialer/listener) from signing (client/server) concerns to simplify workflow.
Unifies and abstracts away the differences between unix and tcp sockets.
A single signer endpoint implementation unifies connection handling code (read/write/close/connection obj)
The signer request handler (server side) is customizable to increase testability.
Updates and extends unit tests

A high level overview of the classes is as follows:

Transport (endpoints): The following classes take care of establishing a connection

SignerDialerEndpoint
SignerListeningEndpoint
SignerEndpoint groups common functionality (read/write/timeouts/etc.)

Signing (client/server): The following classes take care of exchanging request/responses

SignerClient
SignerServer

This PR also closes #3601

Commits:

* refactoring - work in progress
* reworking unit tests
* Encapsulating and fixing unit tests
* Improve tests
* Clean up
* Fix/improve unit tests
* clean up tests
* Improving service endpoint
* fixing unit test
* fix linter issues
* avoid invalid cache values (improve later?)
* complete implementation
* wip
* improved connection loop
* Improve reconnections + fixing unit tests
* addressing comments
* small formatting changes
* clean up
* Update node/node.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_client.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_client_test.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* check during initialization
* dropping connecting when writing fails
* removing break
* use t.log instead
* unifying and using cmn.GetFreePort()
* review fixes
* reordering and unifying drop connection
* closing instead of signalling
* refactored service loop
* removed superfluous brackets
* GetPubKey can return errors
* Revert "GetPubKey can return errors"
  This reverts commit 68c06f19b4650389d7e5ab1659b318889028202c.
* adding entry to changelog
* Update CHANGELOG_PENDING.md
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_client.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_listener_endpoint_test.go
  Co-Authored-By: jleni <juan.leni@zondax.ch>
* updating node.go
* review fixes
* fixes linter
* fixing unit test
* small fixes in comments
* addressing review comments
* addressing review comments 2
* reverting suggestion
* Update privval/signer_client_test.go
  Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update privval/signer_client_test.go
  Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update privval/signer_listener_endpoint_test.go
  Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* do not expose brokenSignerDialerEndpoint
* clean up logging
* unifying methods shorten test time signer also drops
* reenabling pings
* improving testability + unit test
* fixing go fmt + unit test
* remove unused code
* Addressing review comments
* simplifying connection workflow
* fix linter/go import issue
* using base service quit
* updating comment
* Simplifying design + adjusting names
* fixing linter issues
* refactoring test harness + fixes
* Addressing review comments
* cleaning up
* adding additional error check
5 years ago
Close and retry a RemoteSigner on err (#2923)

* Close and recreate a RemoteSigner on err
* Update changelog
* Address Anton's comments / suggestions:
  - update changelog
  - restart TCPVal
  - shut down on `ErrUnexpectedResponse`
* re-init remote signer client with fresh connection if Ping fails
  - add/update TODOs in secret connection
  - rename tcp.go -> tcp_client.go, same with ipc to clarify their purpose
* account for `conn` returned by waitConnection can be `nil`
  - also add TODO about RemoteSigner conn field
* Tests for retrying: IPC / TCP
  - shorter info log on success
  - set conn and use it in tests to close conn
* Tests for retrying: IPC / TCP
  - shorter info log on success
  - set conn and use it in tests to close conn
  - add rwmutex for conn field in IPC
* comments and doc.go
* fix ipc tests. fixes #2677
* use constants for tests
* cleanup some error statements
* fixes #2784, race in tests
* remove print statement
* minor fixes from review
* update comment on sts spec
* cosmetics
* p2p/conn: add failing tests
* p2p/conn: make SecretConnection thread safe
* changelog
* IPCVal signer refactor
  - use a .reset() method
  - don't use embedded RemoteSignerClient
  - guard RemoteSignerClient with mutex
  - drop the .conn
  - expose Close() on RemoteSignerClient
* apply IPCVal refactor to TCPVal
* remove mtx from RemoteSignerClient
* consolidate IPCVal and TCPVal, fixes #3104
  - done in tcp_client.go
  - now called SocketVal
  - takes a listener in the constructor
  - make tcpListener and unixListener contain all the differences
* delete ipc files
* introduce unix and tcp dialer for RemoteSigner
* rename files
  - drop tcp_ prefix
  - rename priv_validator.go to file.go
* bring back listener options
* fix node
* fix priv_val_server
* fix node test
* minor cleanup and comments
6 years ago
package node

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	_ "net/http/pprof" // nolint: gosec // securely exposed on separate, optional port
	"strconv"
	"time"

	_ "github.com/lib/pq" // provide the psql db driver
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
	"github.com/rs/cors"

	abci "github.com/tendermint/tendermint/abci/types"
	cfg "github.com/tendermint/tendermint/config"
	"github.com/tendermint/tendermint/crypto"
	cs "github.com/tendermint/tendermint/internal/consensus"
	"github.com/tendermint/tendermint/internal/mempool"
	"github.com/tendermint/tendermint/internal/p2p"
	"github.com/tendermint/tendermint/internal/p2p/pex"
	"github.com/tendermint/tendermint/internal/statesync"
	"github.com/tendermint/tendermint/libs/log"
	tmnet "github.com/tendermint/tendermint/libs/net"
	tmpubsub "github.com/tendermint/tendermint/libs/pubsub"
	"github.com/tendermint/tendermint/libs/service"
	"github.com/tendermint/tendermint/libs/strings"
	tmtime "github.com/tendermint/tendermint/libs/time"
	"github.com/tendermint/tendermint/privval"
	tmgrpc "github.com/tendermint/tendermint/privval/grpc"
	"github.com/tendermint/tendermint/proxy"
	rpccore "github.com/tendermint/tendermint/rpc/core"
	grpccore "github.com/tendermint/tendermint/rpc/grpc"
	rpcserver "github.com/tendermint/tendermint/rpc/jsonrpc/server"
	sm "github.com/tendermint/tendermint/state"
	"github.com/tendermint/tendermint/store"
	"github.com/tendermint/tendermint/types"
)

// nodeImpl is the highest level interface to a full Tendermint node.
// It includes all configuration information and running services.
type nodeImpl struct {
	service.BaseService

	// config
	config        *cfg.Config
	genesisDoc    *types.GenesisDoc   // initial validator set
	privValidator types.PrivValidator // local node's validator key

	// network
	transport   *p2p.MConnTransport
	sw          *p2p.Switch // p2p connections
	peerManager *p2p.PeerManager
	router      *p2p.Router
	addrBook    pex.AddrBook // known peers
	nodeInfo    types.NodeInfo
	nodeKey     types.NodeKey // our node privkey
	isListening bool

	// services
	eventBus         *types.EventBus // pub/sub for services
	stateStore       sm.Store
	blockStore       *store.BlockStore // store the blockchain to disk
	bcReactor        service.Service   // for block-syncing
	mempoolReactor   service.Service   // for gossipping transactions
	mempool          mempool.Mempool
	stateSync        bool               // whether the node should state sync on startup
	stateSyncReactor *statesync.Reactor // for hosting and restoring state sync snapshots
	consensusReactor *cs.Reactor        // for participating in the consensus
	pexReactor       service.Service    // for exchanging peer addresses
	evidenceReactor  service.Service
	rpcListeners     []net.Listener // rpc servers
	indexerService   service.Service
	rpcEnv           *rpccore.Environment
	prometheusSrv    *http.Server
}

// newDefaultNode returns a Tendermint node with default settings for the
// PrivValidator, ClientCreator, GenesisDoc, and DBProvider.
// It implements NodeProvider.
func newDefaultNode(config *cfg.Config, logger log.Logger) (service.Service, error) {
	nodeKey, err := types.LoadOrGenNodeKey(config.NodeKeyFile())
	if err != nil {
		return nil, fmt.Errorf("failed to load or gen node key %s: %w", config.NodeKeyFile(), err)
	}
	if config.Mode == cfg.ModeSeed {
		return makeSeedNode(config,
			cfg.DefaultDBProvider,
			nodeKey,
			defaultGenesisDocProviderFunc(config),
			logger,
		)
	}

	var pval *privval.FilePV
	if config.Mode == cfg.ModeValidator {
		pval, err = privval.LoadOrGenFilePV(config.PrivValidator.KeyFile(), config.PrivValidator.StateFile())
		if err != nil {
			return nil, err
		}
	} else {
		pval = nil
	}

	appClient, _ := proxy.DefaultClientCreator(config.ProxyApp, config.ABCI, config.DBDir())
	return makeNode(config,
		pval,
		nodeKey,
		appClient,
		defaultGenesisDocProviderFunc(config),
		cfg.DefaultDBProvider,
		logger,
	)
}

// makeNode returns a new, ready to go, Tendermint Node.
func makeNode(config *cfg.Config,
	privValidator types.PrivValidator,
	nodeKey types.NodeKey,
	clientCreator proxy.ClientCreator,
	genesisDocProvider genesisDocProvider,
	dbProvider cfg.DBProvider,
	logger log.Logger) (service.Service, error) {

	blockStore, stateDB, err := initDBs(config, dbProvider)
	if err != nil {
		return nil, err
	}
	stateStore := sm.NewStore(stateDB)

	genDoc, err := genesisDocProvider()
	if err != nil {
		return nil, err
	}

	err = genDoc.ValidateAndComplete()
	if err != nil {
		return nil, fmt.Errorf("error in genesis doc: %w", err)
	}

	state, err := loadStateFromDBOrGenesisDocProvider(stateStore, genDoc)
	if err != nil {
		return nil, err
	}

	// Create the proxyApp and establish connections to the ABCI app (consensus, mempool, query).
	proxyApp, err := createAndStartProxyAppConns(clientCreator, logger)
	if err != nil {
		return nil, err
	}

	// EventBus and IndexerService must be started before the handshake because
	// we might need to index the txs of the replayed block as this might not have happened
	// when the node stopped last time (i.e. the node stopped after it saved the block
	// but before it indexed the txs, or, endblocker panicked)
	eventBus, err := createAndStartEventBus(logger)
	if err != nil {
		return nil, err
	}

	indexerService, eventSinks, err := createAndStartIndexerService(config, dbProvider, eventBus, logger, genDoc.ChainID)
	if err != nil {
		return nil, err
	}

	// If an address is provided, listen on the socket for a connection from an
	// external signing process.
	if config.PrivValidator.ListenAddr != "" {
		protocol, _ := tmnet.ProtocolAndAddress(config.PrivValidator.ListenAddr)
		// FIXME: we should start services inside OnStart
		switch protocol {
		case "grpc":
			privValidator, err = createAndStartPrivValidatorGRPCClient(config, genDoc.ChainID, logger)
			if err != nil {
				return nil, fmt.Errorf("error with private validator grpc client: %w", err)
			}
		default:
			privValidator, err = createAndStartPrivValidatorSocketClient(config.PrivValidator.ListenAddr, genDoc.ChainID, logger)
			if err != nil {
				return nil, fmt.Errorf("error with private validator socket client: %w", err)
			}
		}
	}

	var pubKey crypto.PubKey
	if config.Mode == cfg.ModeValidator {
		pubKey, err = privValidator.GetPubKey(context.TODO())
		if err != nil {
			return nil, fmt.Errorf("can't get pubkey: %w", err)
		}
		if pubKey == nil {
			return nil, errors.New("could not retrieve public key from private validator")
		}
	}

	// Determine whether we should attempt state sync.
	stateSync := config.StateSync.Enable && !onlyValidatorIsUs(state, pubKey)
	if stateSync && state.LastBlockHeight > 0 {
		logger.Info("Found local state with non-zero height, skipping state sync")
		stateSync = false
	}

	// Create the handshaker, which calls RequestInfo, sets the AppVersion on the state,
	// and replays any blocks as necessary to sync tendermint with the app.
	consensusLogger := logger.With("module", "consensus")
	if !stateSync {
		if err := doHandshake(stateStore, state, blockStore, genDoc, eventBus, proxyApp, consensusLogger); err != nil {
			return nil, err
		}

		// Reload the state. It will have the Version.Consensus.App set by the
		// Handshake, and may have other modifications as well (i.e. depending on
		// what happened during block replay).
		state, err = stateStore.Load()
		if err != nil {
			return nil, fmt.Errorf("cannot load state: %w", err)
		}
	}

	// Determine whether we should do block sync. This must happen after the handshake, since the
	// app may modify the validator set, specifying ourself as the only validator.
	blockSync := config.BlockSync.Enable && !onlyValidatorIsUs(state, pubKey)

	logNodeStartupInfo(state, pubKey, logger, consensusLogger, config.Mode)

	// TODO: Fetch and provide real options and do proper p2p bootstrapping.
	// TODO: Use a persistent peer database.
	nodeInfo, err := makeNodeInfo(config, nodeKey, eventSinks, genDoc, state)
	if err != nil {
		return nil, err
	}

	p2pLogger := logger.With("module", "p2p")
	transport := createTransport(p2pLogger, config)

	peerManager, err := createPeerManager(config, dbProvider, p2pLogger, nodeKey.ID)
	if err != nil {
		return nil, fmt.Errorf("failed to create peer manager: %w", err)
	}

	csMetrics, p2pMetrics, memplMetrics, smMetrics := defaultMetricsProvider(config.Instrumentation)(genDoc.ChainID)

	router, err := createRouter(p2pLogger, p2pMetrics, nodeInfo, nodeKey.PrivKey,
		peerManager, transport, getRouterConfig(config, proxyApp))
	if err != nil {
		return nil, fmt.Errorf("failed to create router: %w", err)
	}

	mpReactorShim, mpReactor, mp, err := createMempoolReactor(
		config, proxyApp, state, memplMetrics, peerManager, router, logger,
	)
	if err != nil {
		return nil, err
	}

	evReactorShim, evReactor, evPool, err := createEvidenceReactor(
		config, dbProvider, stateDB, blockStore, peerManager, router, logger,
	)
	if err != nil {
		return nil, err
	}

	// make block executor for consensus and blockchain reactors to execute blocks
	blockExec := sm.NewBlockExecutor(
		stateStore,
		logger.With("module", "state"),
		proxyApp.Consensus(),
		mp,
		evPool,
		blockStore,
		sm.BlockExecutorWithMetrics(smMetrics),
	)

	csReactorShim, csReactor, csState := createConsensusReactor(
		config, state, blockExec, blockStore, mp, evPool,
		privValidator, csMetrics, stateSync || blockSync, eventBus,
		peerManager, router, consensusLogger,
	)

	// Create the blockchain reactor. Note, we do not start block sync if we're
	// doing a state sync first.
	bcReactorShim, bcReactor, err := createBlockchainReactor(
		logger, config, state, blockExec, blockStore, csReactor,
		peerManager, router, blockSync && !stateSync, csMetrics,
	)
	if err != nil {
		return nil, fmt.Errorf("could not create blockchain reactor: %w", err)
	}

	// TODO: Remove this once the switch is removed.
	var bcReactorForSwitch p2p.Reactor
	if bcReactorShim != nil {
		bcReactorForSwitch = bcReactorShim
	} else {
		bcReactorForSwitch = bcReactor.(p2p.Reactor)
	}

	// Make ConsensusReactor. Don't enable fully if doing a state sync and/or block sync first.
	// FIXME We need to update metrics here, since other reactors don't have access to them.
	if stateSync {
		csMetrics.StateSyncing.Set(1)
	} else if blockSync {
		csMetrics.BlockSyncing.Set(1)
	}

	// Set up state sync reactor, and schedule a sync if requested.
	// FIXME The way we do phased startups (e.g. replay -> block sync -> consensus) is very messy,
	// we should clean this whole thing up. See:
	// https://github.com/tendermint/tendermint/issues/4644
	var (
		stateSyncReactor     *statesync.Reactor
		stateSyncReactorShim *p2p.ReactorShim
		channels             map[p2p.ChannelID]*p2p.Channel
		peerUpdates          *p2p.PeerUpdates
	)

	stateSyncReactorShim = p2p.NewReactorShim(logger.With("module", "statesync"), "StateSyncShim", statesync.ChannelShims)

	if config.P2P.UseLegacy {
		channels = getChannelsFromShim(stateSyncReactorShim)
		peerUpdates = stateSyncReactorShim.PeerUpdates
	} else {
		channels = makeChannelsFromShims(router, statesync.ChannelShims)
		peerUpdates = peerManager.Subscribe()
	}

	stateSyncReactor = statesync.NewReactor(
		genDoc.ChainID,
		genDoc.InitialHeight,
		*config.StateSync,
		stateSyncReactorShim.Logger,
		proxyApp.Snapshot(),
		proxyApp.Query(),
		channels[statesync.SnapshotChannel],
		channels[statesync.ChunkChannel],
		channels[statesync.LightBlockChannel],
		channels[statesync.ParamsChannel],
		peerUpdates,
		stateStore,
		blockStore,
		config.StateSync.TempDir,
	)

	// add the channel descriptors to both the transports
	// FIXME: This should be removed when the legacy p2p stack is removed and
	// transports can either be agnostic to channel descriptors or can be
	// declared in the constructor.
	transport.AddChannelDescriptors(mpReactorShim.GetChannels())
	transport.AddChannelDescriptors(bcReactorForSwitch.GetChannels())
	transport.AddChannelDescriptors(csReactorShim.GetChannels())
	transport.AddChannelDescriptors(evReactorShim.GetChannels())
	transport.AddChannelDescriptors(stateSyncReactorShim.GetChannels())

	// Optionally, start the pex reactor
	//
	// TODO:
	//
	// We need to set Seeds and PersistentPeers on the switch,
	// since it needs to be able to use these (and their DNS names)
	// even if the PEX is off. We can include the DNS name in the NetAddress,
	// but it would still be nice to have a clear list of the current "PersistentPeers"
	// somewhere that we can return with net_info.
	//
	// If PEX is on, it should handle dialing the seeds. Otherwise the switch does it.
	// Note we currently use the addrBook regardless at least for AddOurAddress
	var (
		pexReactor service.Service
		sw         *p2p.Switch
		addrBook   pex.AddrBook
	)

	pexCh := pex.ChannelDescriptor()
	transport.AddChannelDescriptors([]*p2p.ChannelDescriptor{&pexCh})

	if config.P2P.UseLegacy {
		// setup Transport and Switch
		sw = createSwitch(
			config, transport, p2pMetrics, mpReactorShim, bcReactorForSwitch,
			stateSyncReactorShim, csReactorShim, evReactorShim, proxyApp, nodeInfo, nodeKey, p2pLogger,
		)

		err = sw.AddPersistentPeers(strings.SplitAndTrimEmpty(config.P2P.PersistentPeers, ",", " "))
		if err != nil {
			return nil, fmt.Errorf("could not add peers from persistent-peers field: %w", err)
		}

		err = sw.AddUnconditionalPeerIDs(strings.SplitAndTrimEmpty(config.P2P.UnconditionalPeerIDs, ",", " "))
		if err != nil {
			return nil, fmt.Errorf("could not add peer ids from unconditional_peer_ids field: %w", err)
		}

		addrBook, err = createAddrBookAndSetOnSwitch(config, sw, p2pLogger, nodeKey)
		if err != nil {
			return nil, fmt.Errorf("could not create addrbook: %w", err)
		}

		pexReactor = createPEXReactorAndAddToSwitch(addrBook, config, sw, logger)
	} else {
		addrBook = nil
		pexReactor, err = createPEXReactorV2(config, logger, peerManager, router)
		if err != nil {
			return nil, err
		}
	}

	if config.RPC.PprofListenAddress != "" {
		go func() {
			logger.Info("Starting pprof server", "laddr", config.RPC.PprofListenAddress)
			logger.Error("pprof server error", "err", http.ListenAndServe(config.RPC.PprofListenAddress, nil))
		}()
	}

	node := &nodeImpl{
		config:        config,
		genesisDoc:    genDoc,
		privValidator: privValidator,

		transport:   transport,
		sw:          sw,
		peerManager: peerManager,
		router:      router,
		addrBook:    addrBook,
		nodeInfo:    nodeInfo,
		nodeKey:     nodeKey,

		stateStore:       stateStore,
		blockStore:       blockStore,
		bcReactor:        bcReactor,
		mempoolReactor:   mpReactor,
		mempool:          mp,
		consensusReactor: csReactor,
		stateSyncReactor: stateSyncReactor,
		stateSync:        stateSync,
		pexReactor:       pexReactor,
		evidenceReactor:  evReactor,
		indexerService:   indexerService,
		eventBus:         eventBus,

		rpcEnv: &rpccore.Environment{
			ProxyAppQuery:   proxyApp.Query(),
			ProxyAppMempool: proxyApp.Mempool(),

			StateStore:       stateStore,
			BlockStore:       blockStore,
			EvidencePool:     evPool,
			ConsensusState:   csState,
			ConsensusReactor: csReactor,
			BlockSyncReactor: bcReactor.(cs.BlockSyncReactor),

			P2PPeers:    sw,
			PeerManager: peerManager,

			GenDoc:     genDoc,
			EventSinks: eventSinks,
			EventBus:   eventBus,
			Mempool:    mp,
			Logger:     logger.With("module", "rpc"),
			Config:     *config.RPC,
		},
	}

	// This is a terrible hack: typed nil interfaces are not == nil,
	// so this is just cleanup to avoid having a non-nil
	// value in the RPC environment that has the semantic
	// properties of nil.
	if sw == nil {
		node.rpcEnv.P2PPeers = nil
	} else if peerManager == nil {
		node.rpcEnv.PeerManager = nil
	}
	// end hack

	node.rpcEnv.P2PTransport = node

	node.BaseService = *service.NewBaseService(logger, "Node", node)

	return node, nil
}

// makeSeedNode returns a new seed node, containing only p2p, pex reactor
func makeSeedNode(config *cfg.Config,
	dbProvider cfg.DBProvider,
	nodeKey types.NodeKey,
	genesisDocProvider genesisDocProvider,
	logger log.Logger,
) (service.Service, error) {

	genDoc, err := genesisDocProvider()
	if err != nil {
		return nil, err
	}

	state, err := sm.MakeGenesisState(genDoc)
	if err != nil {
		return nil, err
	}

	nodeInfo, err := makeSeedNodeInfo(config, nodeKey, genDoc, state)
	if err != nil {
		return nil, err
	}

	// Setup Transport and Switch.
	p2pMetrics := p2p.PrometheusMetrics(config.Instrumentation.Namespace, "chain_id", genDoc.ChainID)
	p2pLogger := logger.With("module", "p2p")
	transport := createTransport(p2pLogger, config)

	peerManager, err := createPeerManager(config, dbProvider, p2pLogger, nodeKey.ID)
	if err != nil {
		return nil, fmt.Errorf("failed to create peer manager: %w", err)
	}

	router, err := createRouter(p2pLogger, p2pMetrics, nodeInfo, nodeKey.PrivKey,
		peerManager, transport, getRouterConfig(config, nil))
	if err != nil {
		return nil, fmt.Errorf("failed to create router: %w", err)
	}

	var (
		pexReactor service.Service
		sw         *p2p.Switch
		addrBook   pex.AddrBook
	)

	// add the pex reactor
	// FIXME: we add channel descriptors to both the router and the transport but only the router
	// should be aware of channel info. We should remove this from transport once the legacy
	// p2p stack is removed.
	pexCh := pex.ChannelDescriptor()
	transport.AddChannelDescriptors([]*p2p.ChannelDescriptor{&pexCh})

	if config.P2P.UseLegacy {
		sw = createSwitch(
			config, transport, p2pMetrics, nil, nil,
			nil, nil, nil, nil, nodeInfo, nodeKey, p2pLogger,
		)

		err = sw.AddPersistentPeers(strings.SplitAndTrimEmpty(config.P2P.PersistentPeers, ",", " "))
		if err != nil {
			return nil, fmt.Errorf("could not add peers from persistent_peers field: %w", err)
		}

		err = sw.AddUnconditionalPeerIDs(strings.SplitAndTrimEmpty(config.P2P.UnconditionalPeerIDs, ",", " "))
		if err != nil {
			return nil, fmt.Errorf("could not add peer ids from unconditional_peer_ids field: %w", err)
		}

		addrBook, err = createAddrBookAndSetOnSwitch(config, sw, p2pLogger, nodeKey)
		if err != nil {
			return nil, fmt.Errorf("could not create addrbook: %w", err)
		}

		pexReactor = createPEXReactorAndAddToSwitch(addrBook, config, sw, logger)
	} else {
		pexReactor, err = createPEXReactorV2(config, logger, peerManager, router)
		if err != nil {
			return nil, err
		}
	}

	if config.RPC.PprofListenAddress != "" {
		go func() {
			logger.Info("Starting pprof server", "laddr", config.RPC.PprofListenAddress)
			logger.Error("pprof server error", "err", http.ListenAndServe(config.RPC.PprofListenAddress, nil))
		}()
	}

	node := &nodeImpl{
		config:     config,
		genesisDoc: genDoc,

		transport:   transport,
		sw:          sw,
		addrBook:    addrBook,
		nodeInfo:    nodeInfo,
		nodeKey:     nodeKey,
		peerManager: peerManager,
		router:      router,

		pexReactor: pexReactor,
	}
	node.BaseService = *service.NewBaseService(logger, "SeedNode", node)

	return node, nil
}

// OnStart starts the Node. It implements service.Service.
func (n *nodeImpl) OnStart() error {
	now := tmtime.Now()
	genTime := n.genesisDoc.GenesisTime
	if genTime.After(now) {
		n.Logger.Info("Genesis time is in the future. Sleeping until then...", "genTime", genTime)
		time.Sleep(genTime.Sub(now))
	}

	// Start the RPC server before the P2P server
	// so we can e.g. receive txs for the first block
	if n.config.RPC.ListenAddress != "" && n.config.Mode != cfg.ModeSeed {
		listeners, err := n.startRPC()
		if err != nil {
			return err
		}
		n.rpcListeners = listeners
	}

	if n.config.Instrumentation.Prometheus &&
		n.config.Instrumentation.PrometheusListenAddr != "" {
		n.prometheusSrv = n.startPrometheusServer(n.config.Instrumentation.PrometheusListenAddr)
	}

	// Start the transport.
	addr, err := types.NewNetAddressString(n.nodeKey.ID.AddressString(n.config.P2P.ListenAddress))
	if err != nil {
		return err
	}
	if err := n.transport.Listen(p2p.NewEndpoint(addr)); err != nil {
		return err
	}

	n.isListening = true

	n.Logger.Info("p2p service", "legacy_enabled", n.config.P2P.UseLegacy)

	if n.config.P2P.UseLegacy {
		// Add private IDs to addrbook to block those peers being added
		n.addrBook.AddPrivateIDs(strings.SplitAndTrimEmpty(n.config.P2P.PrivatePeerIDs, ",", " "))
		if err = n.sw.Start(); err != nil {
			return err
		}
	} else if err = n.router.Start(); err != nil {
		return err
	}

	if n.config.Mode != cfg.ModeSeed {
		if n.config.BlockSync.Version == cfg.BlockSyncV0 {
			if err := n.bcReactor.Start(); err != nil {
				return err
			}
		}

		// Start the real consensus reactor separately since the switch uses the shim.
		if err := n.consensusReactor.Start(); err != nil {
			return err
		}

		// Start the real state sync reactor separately since the switch uses the shim.
		if err := n.stateSyncReactor.Start(); err != nil {
			return err
		}

		// Start the real mempool reactor separately since the switch uses the shim.
		if err := n.mempoolReactor.Start(); err != nil {
			return err
		}

		// Start the real evidence reactor separately since the switch uses the shim.
		if err := n.evidenceReactor.Start(); err != nil {
			return err
		}
	}

	if n.config.P2P.UseLegacy {
		// Always connect to persistent peers
		err = n.sw.DialPeersAsync(strings.SplitAndTrimEmpty(n.config.P2P.PersistentPeers, ",", " "))
		if err != nil {
			return fmt.Errorf("could not dial peers from persistent-peers field: %w", err)
		}
	} else if err := n.pexReactor.Start(); err != nil {
		return err
	}

	// Run state sync
	// TODO: We shouldn't run state sync if we already have state that has a
	// LastBlockHeight that is not InitialHeight
	if n.stateSync {
		bcR, ok := n.bcReactor.(cs.BlockSyncReactor)
		if !ok {
			return fmt.Errorf("this blockchain reactor does not support switching from state sync")
		}

		// we need to get the genesis state to get parameters such as
		state, err := sm.MakeGenesisState(n.genesisDoc)
		if err != nil {
			return fmt.Errorf("unable to derive state: %w", err)
		}

		// TODO: we may want to move these events within the respective
		// reactors.
		// At the start of state sync we use the initial height as the event height,
		// because state sync does not know the concrete state height until it has
		// fetched the snapshot.
		d := types.EventDataStateSyncStatus{Complete: false, Height: state.InitialHeight}
		if err := n.eventBus.PublishEventStateSyncStatus(d); err != nil {
			n.eventBus.Logger.Error("failed to emit the statesync start event", "err", err)
		}

		// FIXME: We shouldn't allow state sync to silently error out without
		// bubbling up the error and gracefully shutting down the rest of the node
		go func() {
			n.Logger.Info("starting state sync")
			state, err := n.stateSyncReactor.Sync(context.TODO())
			if err != nil {
				n.Logger.Error("state sync failed", "err", err)
				return
			}

			n.consensusReactor.SetStateSyncingMetrics(0)

			d := types.EventDataStateSyncStatus{Complete: true, Height: state.LastBlockHeight}
			if err := n.eventBus.PublishEventStateSyncStatus(d); err != nil {
				n.eventBus.Logger.Error("failed to emit the statesync completed event", "err", err)
			}

			// TODO: Some form of orchestrator is needed here between the state
			// advancing reactors to be able to control which one of the three
			// is running
			if n.config.BlockSync.Enable {
				// FIXME Very ugly to have these metrics bleed through here.
				n.consensusReactor.SetBlockSyncingMetrics(1)
				if err := bcR.SwitchToBlockSync(state); err != nil {
					n.Logger.Error("failed to switch to block sync", "err", err)
					return
				}

				d := types.EventDataBlockSyncStatus{Complete: false, Height: state.LastBlockHeight}
				if err := n.eventBus.PublishEventBlockSyncStatus(d); err != nil {
					n.eventBus.Logger.Error("failed to emit the block sync starting event", "err", err)
				}
			} else {
				n.consensusReactor.SwitchToConsensus(state, true)
			}
		}()
	}

	return nil
}


// OnStop stops the Node. It implements service.Service.
func (n *nodeImpl) OnStop() {
	n.Logger.Info("Stopping Node")

	// first stop the non-reactor services
	if err := n.eventBus.Stop(); err != nil {
		n.Logger.Error("Error closing eventBus", "err", err)
	}
	if err := n.indexerService.Stop(); err != nil {
		n.Logger.Error("Error closing indexerService", "err", err)
	}

	if n.config.Mode != cfg.ModeSeed {
		// now stop the reactors
		if n.config.BlockSync.Version == cfg.BlockSyncV0 {
			// Stop the real blockchain reactor separately since the switch uses the shim.
			if err := n.bcReactor.Stop(); err != nil {
				n.Logger.Error("failed to stop the blockchain reactor", "err", err)
			}
		}

		// Stop the real consensus reactor separately since the switch uses the shim.
		if err := n.consensusReactor.Stop(); err != nil {
			n.Logger.Error("failed to stop the consensus reactor", "err", err)
		}

		// Stop the real state sync reactor separately since the switch uses the shim.
		if err := n.stateSyncReactor.Stop(); err != nil {
			n.Logger.Error("failed to stop the state sync reactor", "err", err)
		}

		// Stop the real mempool reactor separately since the switch uses the shim.
		if err := n.mempoolReactor.Stop(); err != nil {
			n.Logger.Error("failed to stop the mempool reactor", "err", err)
		}

		// Stop the real evidence reactor separately since the switch uses the shim.
		if err := n.evidenceReactor.Stop(); err != nil {
			n.Logger.Error("failed to stop the evidence reactor", "err", err)
		}
	}

	if err := n.pexReactor.Stop(); err != nil {
		n.Logger.Error("failed to stop the PEX v2 reactor", "err", err)
	}

	if n.config.P2P.UseLegacy {
		if err := n.sw.Stop(); err != nil {
			n.Logger.Error("failed to stop switch", "err", err)
		}
	} else {
		if err := n.router.Stop(); err != nil {
			n.Logger.Error("failed to stop router", "err", err)
		}
	}

	if err := n.transport.Close(); err != nil {
		n.Logger.Error("Error closing transport", "err", err)
	}

	n.isListening = false

	// finally stop the listeners / external services
	for _, l := range n.rpcListeners {
		n.Logger.Info("Closing rpc listener", "listener", l)
		if err := l.Close(); err != nil {
			n.Logger.Error("Error closing listener", "listener", l, "err", err)
		}
	}

	if pvsc, ok := n.privValidator.(service.Service); ok {
		if err := pvsc.Stop(); err != nil {
			n.Logger.Error("Error closing private validator", "err", err)
		}
	}

	if n.prometheusSrv != nil {
		if err := n.prometheusSrv.Shutdown(context.Background()); err != nil {
			// Error from closing listeners, or context timeout:
			n.Logger.Error("Prometheus HTTP server Shutdown", "err", err)
		}
	}
}

func (n *nodeImpl) startRPC() ([]net.Listener, error) {
	if n.config.Mode == cfg.ModeValidator {
		pubKey, err := n.privValidator.GetPubKey(context.TODO())
		if pubKey == nil || err != nil {
			return nil, fmt.Errorf("can't get pubkey: %w", err)
		}
		n.rpcEnv.PubKey = pubKey
	}
	if err := n.rpcEnv.InitGenesisChunks(); err != nil {
		return nil, err
	}

	listenAddrs := strings.SplitAndTrimEmpty(n.config.RPC.ListenAddress, ",", " ")
	routes := n.rpcEnv.GetRoutes()

	if n.config.RPC.Unsafe {
		n.rpcEnv.AddUnsafe(routes)
	}

	config := rpcserver.DefaultConfig()
	config.MaxBodyBytes = n.config.RPC.MaxBodyBytes
	config.MaxHeaderBytes = n.config.RPC.MaxHeaderBytes
	config.MaxOpenConnections = n.config.RPC.MaxOpenConnections
	// If necessary adjust global WriteTimeout to ensure it's greater than
	// TimeoutBroadcastTxCommit.
	// See https://github.com/tendermint/tendermint/issues/3435
	if config.WriteTimeout <= n.config.RPC.TimeoutBroadcastTxCommit {
		config.WriteTimeout = n.config.RPC.TimeoutBroadcastTxCommit + 1*time.Second
	}

	// we may expose the rpc over both a unix and tcp socket
	listeners := make([]net.Listener, len(listenAddrs))
	for i, listenAddr := range listenAddrs {
		mux := http.NewServeMux()
		rpcLogger := n.Logger.With("module", "rpc-server")
		wmLogger := rpcLogger.With("protocol", "websocket")
		wm := rpcserver.NewWebsocketManager(routes,
			rpcserver.OnDisconnect(func(remoteAddr string) {
				err := n.eventBus.UnsubscribeAll(context.Background(), remoteAddr)
				if err != nil && err != tmpubsub.ErrSubscriptionNotFound {
					wmLogger.Error("Failed to unsubscribe addr from events", "addr", remoteAddr, "err", err)
				}
			}),
			rpcserver.ReadLimit(config.MaxBodyBytes),
		)
		wm.SetLogger(wmLogger)
		mux.HandleFunc("/websocket", wm.WebsocketHandler)
		rpcserver.RegisterRPCFuncs(mux, routes, rpcLogger)
		listener, err := rpcserver.Listen(
			listenAddr,
			config.MaxOpenConnections,
		)
		if err != nil {
			return nil, err
		}

		var rootHandler http.Handler = mux
		if n.config.RPC.IsCorsEnabled() {
			corsMiddleware := cors.New(cors.Options{
				AllowedOrigins: n.config.RPC.CORSAllowedOrigins,
				AllowedMethods: n.config.RPC.CORSAllowedMethods,
				AllowedHeaders: n.config.RPC.CORSAllowedHeaders,
			})
			rootHandler = corsMiddleware.Handler(mux)
		}
		if n.config.RPC.IsTLSEnabled() {
			go func() {
				if err := rpcserver.ServeTLS(
					listener,
					rootHandler,
					n.config.RPC.CertFile(),
					n.config.RPC.KeyFile(),
					rpcLogger,
					config,
				); err != nil {
					n.Logger.Error("Error serving server with TLS", "err", err)
				}
			}()
		} else {
			go func() {
				if err := rpcserver.Serve(
					listener,
					rootHandler,
					rpcLogger,
					config,
				); err != nil {
					n.Logger.Error("Error serving server", "err", err)
				}
			}()
		}

		listeners[i] = listener
	}

	// we expose a simplified api over grpc for convenience to app devs
	grpcListenAddr := n.config.RPC.GRPCListenAddress
	if grpcListenAddr != "" {
		config := rpcserver.DefaultConfig()
		config.MaxBodyBytes = n.config.RPC.MaxBodyBytes
		config.MaxHeaderBytes = n.config.RPC.MaxHeaderBytes
		// NOTE: GRPCMaxOpenConnections is used, not MaxOpenConnections
		config.MaxOpenConnections = n.config.RPC.GRPCMaxOpenConnections
		// If necessary adjust global WriteTimeout to ensure it's greater than
		// TimeoutBroadcastTxCommit.
		// See https://github.com/tendermint/tendermint/issues/3435
		if config.WriteTimeout <= n.config.RPC.TimeoutBroadcastTxCommit {
			config.WriteTimeout = n.config.RPC.TimeoutBroadcastTxCommit + 1*time.Second
		}
		listener, err := rpcserver.Listen(grpcListenAddr, config.MaxOpenConnections)
		if err != nil {
			return nil, err
		}
		go func() {
			if err := grpccore.StartGRPCServer(n.rpcEnv, listener); err != nil {
				n.Logger.Error("Error starting gRPC server", "err", err)
			}
		}()
		listeners = append(listeners, listener)
	}

	return listeners, nil
}

// startPrometheusServer starts a Prometheus HTTP server, listening for metrics
// collectors on addr.
func (n *nodeImpl) startPrometheusServer(addr string) *http.Server {
	srv := &http.Server{
		Addr: addr,
		Handler: promhttp.InstrumentMetricHandler(
			prometheus.DefaultRegisterer, promhttp.HandlerFor(
				prometheus.DefaultGatherer,
				promhttp.HandlerOpts{MaxRequestsInFlight: n.config.Instrumentation.MaxOpenConnections},
			),
		),
	}
	go func() {
		if err := srv.ListenAndServe(); err != http.ErrServerClosed {
			// Error starting or closing listener:
			n.Logger.Error("Prometheus HTTP server ListenAndServe", "err", err)
		}
	}()
	return srv
}

// ConsensusReactor returns the Node's ConsensusReactor.
func (n *nodeImpl) ConsensusReactor() *cs.Reactor {
	return n.consensusReactor
}

// Mempool returns the Node's mempool.
func (n *nodeImpl) Mempool() mempool.Mempool {
	return n.mempool
}

// EventBus returns the Node's EventBus.
func (n *nodeImpl) EventBus() *types.EventBus {
	return n.eventBus
}

// PrivValidator returns the Node's PrivValidator.
// XXX: for convenience only!
func (n *nodeImpl) PrivValidator() types.PrivValidator {
	return n.privValidator
}

// GenesisDoc returns the Node's GenesisDoc.
func (n *nodeImpl) GenesisDoc() *types.GenesisDoc {
	return n.genesisDoc
}

// RPCEnvironment makes sure RPC has all the objects it needs to operate.
func (n *nodeImpl) RPCEnvironment() *rpccore.Environment {
	return n.rpcEnv
}

//------------------------------------------------------------------------------

func (n *nodeImpl) Listeners() []string {
	return []string{
		fmt.Sprintf("Listener(@%v)", n.config.P2P.ExternalAddress),
	}
}

func (n *nodeImpl) IsListening() bool {
	return n.isListening
}

// NodeInfo returns the Node's Info from the Switch.
func (n *nodeImpl) NodeInfo() types.NodeInfo {
	return n.nodeInfo
}

// genesisDocProvider returns a GenesisDoc.
// It allows the GenesisDoc to be pulled from sources other than the
// filesystem, for instance from a distributed key-value store cluster.
type genesisDocProvider func() (*types.GenesisDoc, error)

// defaultGenesisDocProviderFunc returns a genesisDocProvider that loads
// the GenesisDoc from the config.GenesisFile() on the filesystem.
func defaultGenesisDocProviderFunc(config *cfg.Config) genesisDocProvider {
	return func() (*types.GenesisDoc, error) {
		return types.GenesisDocFromFile(config.GenesisFile())
	}
}

// metricsProvider returns consensus, p2p, mempool and state Metrics.
type metricsProvider func(chainID string) (*cs.Metrics, *p2p.Metrics, *mempool.Metrics, *sm.Metrics)

// defaultMetricsProvider returns Metrics built using the Prometheus client
// library if Prometheus is enabled. Otherwise, it returns no-op Metrics.
func defaultMetricsProvider(config *cfg.InstrumentationConfig) metricsProvider {
	return func(chainID string) (*cs.Metrics, *p2p.Metrics, *mempool.Metrics, *sm.Metrics) {
		if config.Prometheus {
			return cs.PrometheusMetrics(config.Namespace, "chain_id", chainID),
				p2p.PrometheusMetrics(config.Namespace, "chain_id", chainID),
				mempool.PrometheusMetrics(config.Namespace, "chain_id", chainID),
				sm.PrometheusMetrics(config.Namespace, "chain_id", chainID)
		}
		return cs.NopMetrics(), p2p.NopMetrics(), mempool.NopMetrics(), sm.NopMetrics()
	}
}

//------------------------------------------------------------------------------

// loadStateFromDBOrGenesisDocProvider attempts to load the state from the
// database. If the database holds no state, it derives the initial state
// from the given genesis doc instead.
func loadStateFromDBOrGenesisDocProvider(
	stateStore sm.Store,
	genDoc *types.GenesisDoc,
) (sm.State, error) {

	// 1. Attempt to load state from the database
	state, err := stateStore.Load()
	if err != nil {
		return sm.State{}, err
	}

	if state.IsEmpty() {
		// 2. If it's not there, derive it from the genesis doc
		state, err = sm.MakeGenesisState(genDoc)
		if err != nil {
			return sm.State{}, err
		}
	}

	return state, nil
}

func createAndStartPrivValidatorSocketClient(
	listenAddr,
	chainID string,
	logger log.Logger,
) (types.PrivValidator, error) {

	pve, err := privval.NewSignerListener(listenAddr, logger)
	if err != nil {
		return nil, fmt.Errorf("failed to start private validator: %w", err)
	}

	pvsc, err := privval.NewSignerClient(pve, chainID)
	if err != nil {
		return nil, fmt.Errorf("failed to start private validator: %w", err)
	}

	// try to get a pubkey from the private validator up front, so that a
	// misconfigured or unreachable signer is reported at startup
	_, err = pvsc.GetPubKey(context.TODO())
	if err != nil {
		return nil, fmt.Errorf("can't get pubkey: %w", err)
	}

	const (
		retries = 50 // 50 * 100ms = 5s total
		timeout = 100 * time.Millisecond
	)
	pvscWithRetries := privval.NewRetrySignerClient(pvsc, retries, timeout)

	return pvscWithRetries, nil
}

func createAndStartPrivValidatorGRPCClient(
	config *cfg.Config,
	chainID string,
	logger log.Logger,
) (types.PrivValidator, error) {
	pvsc, err := tmgrpc.DialRemoteSigner(
		config.PrivValidator,
		chainID,
		logger,
		config.Instrumentation.Prometheus,
	)
	if err != nil {
		return nil, fmt.Errorf("failed to start private validator: %w", err)
	}

	// try to get a pubkey from the private validator up front, so that a
	// misconfigured or unreachable signer is reported at startup
	_, err = pvsc.GetPubKey(context.TODO())
	if err != nil {
		return nil, fmt.Errorf("can't get pubkey: %w", err)
	}

	return pvsc, nil
}

func getRouterConfig(conf *cfg.Config, proxyApp proxy.AppConns) p2p.RouterOptions {
	opts := p2p.RouterOptions{
		QueueType: conf.P2P.QueueType,
	}

	if conf.P2P.MaxNumInboundPeers > 0 {
		opts.MaxIncomingConnectionAttempts = conf.P2P.MaxIncomingConnectionAttempts
	}

	if conf.FilterPeers && proxyApp != nil {
		opts.FilterPeerByID = func(ctx context.Context, id types.NodeID) error {
			// Use the caller's ctx (as FilterPeerByIP does) so that
			// cancellation propagates to the ABCI query.
			res, err := proxyApp.Query().QuerySync(ctx, abci.RequestQuery{
				Path: fmt.Sprintf("/p2p/filter/id/%s", id),
			})
			if err != nil {
				return err
			}
			if res.IsErr() {
				return fmt.Errorf("error querying abci app: %v", res)
			}

			return nil
		}

		opts.FilterPeerByIP = func(ctx context.Context, ip net.IP, port uint16) error {
			res, err := proxyApp.Query().QuerySync(ctx, abci.RequestQuery{
				Path: fmt.Sprintf("/p2p/filter/addr/%s", net.JoinHostPort(ip.String(), strconv.Itoa(int(port)))),
			})
			if err != nil {
				return err
			}
			if res.IsErr() {
				return fmt.Errorf("error querying abci app: %v", res)
			}

			return nil
		}
	}

	return opts
}

// FIXME: Temporary helper function, shims should be removed.
func makeChannelsFromShims(
	router *p2p.Router,
	chShims map[p2p.ChannelID]*p2p.ChannelDescriptorShim,
) map[p2p.ChannelID]*p2p.Channel {

	channels := map[p2p.ChannelID]*p2p.Channel{}
	for chID, chShim := range chShims {
		ch, err := router.OpenChannel(*chShim.Descriptor, chShim.MsgType, chShim.Descriptor.RecvBufferCapacity)
		if err != nil {
			panic(fmt.Sprintf("failed to open channel %v: %v", chID, err))
		}

		channels[chID] = ch
	}

	return channels
}

func getChannelsFromShim(reactorShim *p2p.ReactorShim) map[p2p.ChannelID]*p2p.Channel {
	channels := map[p2p.ChannelID]*p2p.Channel{}
	for chID := range reactorShim.Channels {
		channels[chID] = reactorShim.GetChannel(chID)
	}

	return channels
}