This is, perhaps, the trival final piece of #7075 that I've been
working on.
There's more work to be done:
- push more of the setup into the pacakges themselves
- move channel-based sending/filtering out of the
- simplify the buffering throuhgout the p2p stack.
This change removes the partial gRPC interface to the RPC service, which was
deprecated in resolution of #6718.
Details:
- rpc: Remove the client and server interfaces and proto definitions.
- Remove the gRPC settings from the config library.
- Remove gRPC setup for the RPC service in the node startup.
- Fix various test helpers to remove gRPC bits.
- Remove the --rpc.grpc-laddr flag from the CLI.
Note that to satisfy the protobuf interface check, this change also includes a
temporary edit to buf.yaml, that I will revert after this is merged.
This PR adds an initial set of metrics for use ABCI. The initial metrics enable the calculation of timing histograms and call counts for each of the ABCI methods. The metrics are also labeled as either 'sync' or 'async' to determine if the method call was performed using ABCI's `*Async` methods.
An example of these metrics is included here for reference:
```
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.0001"} 0
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.0004"} 5
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.002"} 12
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.009"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.02"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.1"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.65"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="2"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="6"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="25"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="+Inf"} 13
tendermint_abci_connection_method_timing_sum{chain_id="ci",method="commit",type="sync"} 0.007802058000000001
tendermint_abci_connection_method_timing_count{chain_id="ci",method="commit",type="sync"} 13
```
These metrics can easily be graphed using prometheus's `histogram_quantile(...)` method to pick out a particular quantile to graph or examine. I chose buckets that were somewhat of an estimate of expected range of times for ABCI operations. They start at .0001 seconds and range to 25 seconds. The hope is that this range captures enough possible times to be useful for us and operators.
A few notes:
- this is not all the deletion that we can do, but this is the most
"simple" case: it leaves in shims, and there's some trivial
additional cleanup to the transport that can happen but that
requires writing more code, and I wanted this to be easy to review
above all else.
- This should land *after* we cut the branch for 0.35, but I'm
anticipating that to happen soon, and I wanted to run this through
CI.
This commit should be one of the first to land as part of the v0.36
cycle *after* cutting the 0.35 branch.
The blocksync/v2 reactor was originally implemented as an experiement
to produce an implementation of the blockstack protocol that would be
easier to test and validate, but it was never appropriately
operationalized and this implementation was never fully debugged. When
the p2p layer was refactored as part of the 0.35 cycle, the v2
implementation was not refactored and it was left in the codebase but
not removed. This commit just removes all references to it.
The code in the Tendermint repository makes heavy use of import aliasing.
This is made necessary by our extensive reuse of common base package names, and
by repetition of similar names across different subdirectories.
Unfortunately we have not been very consistent about which packages we alias in
various circumstances, and the aliases we use vary. In the spirit of the advice
in the style guide and https://github.com/golang/go/wiki/CodeReviewComments#imports,
his change makes an effort to clean up and normalize import aliasing.
This change makes no API or behavioral changes. It is a pure cleanup intended
o help make the code more readable to developers (including myself) trying to
understand what is being imported where.
Only unexported names have been modified, and the changes were generated and
applied mechanically with gofmt -r and comby, respecting the lexical and
syntactic rules of Go. Even so, I did not fix every inconsistency. Where the
changes would be too disruptive, I left it alone.
The principles I followed in this cleanup are:
- Remove aliases that restate the package name.
- Remove aliases where the base package name is unambiguous.
- Move overly-terse abbreviations from the import to the usage site.
- Fix lexical issues (remove underscores, remove capitalization).
- Fix import groupings to more closely match the style guide.
- Group blank (side-effecting) imports and ensure they are commented.
- Add aliases to multiple imports with the same base package name.
The 0.35 release cycle renamed the 'fastsync' functionality to 'blocksync'. This change brings the configuration parameters in line with that change. Namely, it updates the configuration file `[fastsync]` field to be `[blocksync]` and changes the command line flag and config file parameters `--fast-sync` and `fast-sync` to `--enable-block-sync` and `enable-block-sync` respectively.
Error messages were added to help users encountering these changes be able to quickly make the needed update to their files/scripts.
When using the old command line argument for fast-sync, the following is printed
```
./build/tendermint start --proxy-app=kvstore --consensus.create-empty-blocks=false --fast-sync=false
ERROR: invalid argument "false" for "--fast-sync" flag: --fast-sync has been deprecated, please use --enable-block-sync
```
When using one of the old config file parameters, the following is printed:
```
./build/tendermint start --proxy-app=kvstore --consensus.create-empty-blocks=false
ERROR: error in config file: a configuration parameter named 'fast-sync' was found in the configuration file. The 'fast-sync' parameter has been renamed to 'enable-block-sync', please update the 'fast-sync' field in your configuration file to 'enable-block-sync'
```
This is just a configuration change to default to using the new stack
unless explicitly disabled (e.g. `UseLegacy`) this renames the
configuration value and makes the configuration logic more clear.
The legacy option is good to retain as a fallback if the new stack has
issues operationally, but we should make sure that most of the time
we're using the new stack.
EDIT: Updated, see [comment below]( https://github.com/tendermint/tendermint/pull/6785#issuecomment-897793175)
This change adds a sketch of the `Debug` mode.
This change adds a `Debug` struct to the node package. This `Debug` struct is intended to be created and started by a command in the `cmd` directory. The `Debug` struct runs the RPC server on the data directories: both the state store and the block store.
This change required a good deal of refactoring. Namely, a new `rpc.go` file was added to the `node` package. This file encapsulates functions for starting RPC servers used by nodes. A potential additional change is to further factor this code into shared code _in_ the `rpc` package.
Minor API tweaks were also made that seemed appropriate such as the mechanism for fetching routes from the `rpc/core` package.
Additional work is required to register the `Debug` service as a command in the `cmd` directory but I am looking for feedback on if this direction seems appropriate before diving much further.
closes: #5908
closes#2498
solves part of #3365
Note: difficult to test the event emit in SwitchToFastSync part, might need to change `stateSyncReactor` to an interface in the `nodeImpl` struct
## Description
Expose p2p functions for use in the sdk.
These functions could also be copied over to the sdk. I dont have a preference of which is better.
This PR make some tweaks to backfill after running e2e tests:
- Separates sync and backfill as two distinct processes that the node calls. The reason is because if sync fails then the node should fail but if backfill fails it is still possible to proceed.
- Removes peers who don't have the block at a height from the local peer list. As the process goes backwards if a node doesn't have a block at a height they're likely pruning blocks and thus they won't have any prior ones either.
- Sleep when we've run out of peers, then try again.