The main (and minor) win of this PR is that the transport is fully the
responsibility of the router and the node doesn't need to be responsible for its lifecylce.
I saw one of these tests fail and it looks like it was using code that
wasn't being called anywhere, so I deleted it, and avoided the package
name aliasing.
We stopped testing these configurations a while ago, and it doesn't
really make sense to allow nodes to run in this configuration. This
drops support for non-blocksync nodes and cleans up the
configuration/tests accordingly.
Closes: #6908
This pull request adds a new "mesage_type" label to the send/recv bytes metrics calculated in the p2p code.
Below is a snippet of the updated metrics that includes the updated label:
```
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_HasVote",peer_id="2551a13ed720101b271a5df4816d1e4b3d3bd133"} 652
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_HasVote",peer_id="4b1068420ef739db63377250553562b9a978708a"} 631
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_HasVote",peer_id="927c50a5e508c747830ce3ba64a3f70fdda58ef2"} 631
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_NewRoundStep",peer_id="2551a13ed720101b271a5df4816d1e4b3d3bd133"} 393
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_NewRoundStep",peer_id="4b1068420ef739db63377250553562b9a978708a"} 357
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_NewRoundStep",peer_id="927c50a5e508c747830ce3ba64a3f70fdda58ef2"} 386
```
Rework the internal plumbing of the server. This change does not modify the
exported interfaces or semantics of the package, and all the existing tests
still pass.
The main changes here are to:
- Simplify the interface for subscription indexing with a typed index rather
than a single nested map.
- Ensure orderly shutdown of channels, so that there is no longer a dynamic
race with concurrent publishers & subscribers at shutdown.
- Remove a layer of indirection between publishers and subscribers. This mainly
helps legibility.
- Remove order dependencies between registration and delivery.
- Add documentation comments where they seemed helpful, and clarified the
existing comments where it was practical.
Although performance was not a primary goal of this change, the simplifications
did very slightly reduce memory use and increase throughput on the existing
benchmarks, though the delta is not statistically significant.
BENCHMARK BEFORE AFTER SPEEDUP (%) B/op (B) B/op (A)
Benchmark10Clients-12 5947 5566 6.4 2017 1942
Benchmark100Clients-12 6111 5762 5.7 1992 1910
Benchmark1000Clients-12 6983 6344 9.2 2046 1959
This pull request fixes a panic that exists in both mempools. The panic occurs when the ABCI client misses a response from the ABCI application. This happen when the ABCI client drops the request as a result of a full client queue. The fix here was to loop through the ordered list of recheck-tx in the callback until one matches the currently observed recheck request.
This is, perhaps, the trival final piece of #7075 that I've been
working on.
There's more work to be done:
- push more of the setup into the pacakges themselves
- move channel-based sending/filtering out of the
- simplify the buffering throuhgout the p2p stack.
This patch was needed to pass the buf breakage check for the proto file removed
in #7121, but now that master contains the change we no longer need the patch.
This change removes the partial gRPC interface to the RPC service, which was
deprecated in resolution of #6718.
Details:
- rpc: Remove the client and server interfaces and proto definitions.
- Remove the gRPC settings from the config library.
- Remove gRPC setup for the RPC service in the node startup.
- Fix various test helpers to remove gRPC bits.
- Remove the --rpc.grpc-laddr flag from the CLI.
Note that to satisfy the protobuf interface check, this change also includes a
temporary edit to buf.yaml, that I will revert after this is merged.
This metric describes itself as 'pending' but never actual decrements when the messages are removed from the queue.
This change fixes that by decrementing the metric when the data is removed from the queue.
This PR adds an initial set of metrics for use ABCI. The initial metrics enable the calculation of timing histograms and call counts for each of the ABCI methods. The metrics are also labeled as either 'sync' or 'async' to determine if the method call was performed using ABCI's `*Async` methods.
An example of these metrics is included here for reference:
```
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.0001"} 0
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.0004"} 5
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.002"} 12
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.009"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.02"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.1"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.65"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="2"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="6"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="25"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="+Inf"} 13
tendermint_abci_connection_method_timing_sum{chain_id="ci",method="commit",type="sync"} 0.007802058000000001
tendermint_abci_connection_method_timing_count{chain_id="ci",method="commit",type="sync"} 13
```
These metrics can easily be graphed using prometheus's `histogram_quantile(...)` method to pick out a particular quantile to graph or examine. I chose buckets that were somewhat of an estimate of expected range of times for ABCI operations. They start at .0001 seconds and range to 25 seconds. The hope is that this range captures enough possible times to be useful for us and operators.
Fixes#7068. The build-docker rule relies on being able to run make
build-linux, but did not pull the Makefile into the build context.
There are various ways to fix this, but this was probably the smallest.
* ci: use run-multiple.sh for e2e pr tests
* fix labeling
* Update .github/workflows/e2e-nightly-35x.yml
Co-authored-by: M. J. Fromberger <michael.j.fromberger@gmail.com>
Co-authored-by: M. J. Fromberger <michael.j.fromberger@gmail.com>
Fixes#7098. The light client documentation moved to the spec repository.
I was not able to figure out what happened to light-client-protocol.md, it was removed in #5252 but no corresponding file exists in the spec repository. Since the spec also discusses the protocol, this change simply links to the spec and removes the non-functional reference.
Alternatively we could link to the top-level [light client doc](https://docs.tendermint.com/master/tendermint-core/light-client.html) if you think that's better.
This tweaks the connectivity of test configurations, in hopes that more will be viable.
Additionally reduces the prevalence of testing the legacy mempool.
My earlier p2p cleanup code removed support for the p2p tests from the
e2e generator and runner, but missed removing the CI
configuration. This patch remedies that.