This change adds logic to double the message delay bound after every 10 rounds. Alternatives to this somewhat magic number were discussed. Specifically, whether or not to make '10' modifiable as a parameter was discussed. Since this behavior only exists to ensure liveness in the case that these values were poorly chosen to begin with, a method to configure this value was not created. Chains that notice many 'untimely' rounds per the [relevant metric](https://github.com/tendermint/tendermint/pull/7709) are expected to take action to increase the configured message delay to more accurately match the conditions of the network.
closes: https://github.com/tendermint/spec/issues/371
This pull request merges in the changes for implementing Proposer-based timestamps into `master`. The power was primarily being done in the `wb/proposer-based-timestamps` branch, with changes being merged into that branch during development. This pull request represents an amalgamation of the changes made into that development branch. All of the changes that were placed into that branch have been cleanly rebased on top of the latest `master`. The changes compile and the tests pass insofar as our tests in general pass.
### Note To Reviewers
These changes have been extensively reviewed during development. There is not much new here. In the interest of making effective use of time, I would recommend against trying to perform a complete audit of the changes presented and instead examine for mistakes that may have occurred during the process of rebasing the changes. I gave the complete change set a first pass for any issues, but additional eyes would be very appreciated.
In sum, this change set does the following:
closes#6942
merges in #6849
There are no further uses of this package anywhere in Tendermint.
All the uses in the Cosmos SDK are for types that now work correctly with the
standard encoding/json package.
## What does this pull request do?
This pull requests adds two metrics intended for use in calculating an experimental value for `MessageDelay`.
The metrics are as follows:
```
# HELP tendermint_consensus_complete_prevote_message_delay Difference in seconds between the proposal timestamp and the timestamp of the prevote that achieved 100% of the voting power in the prevote step.
# TYPE tendermint_consensus_complete_prevote_message_delay gauge
tendermint_consensus_complete_prevote_message_delay{chain_id="test-chain-aZbwF1"} 0.013025505
# HELP tendermint_consensus_quorum_prevote_message_delay Difference in seconds between the proposal timestamp and the timestamp of the prevote that achieved a quorum in the prevote step.
# TYPE tendermint_consensus_quorum_prevote_message_delay gauge
tendermint_consensus_quorum_prevote_message_delay{chain_id="test-chain-aZbwF1"} 0.013025505
```
## Why this change?
For more information on what these metrics are calculating, see #7202. The aim is to merge to backport these metrics to v0.34 and run nodes on a few popular chains with these metrics to determine the experimental values for `MessageDelay` on these popular chains and use these to select our default `SynchronyParams.MessageDelay` value.
## Why Gauges for the metrics?
Gauges allow us to overwrite the metric on each successive observation. We can then capture these metrics over time to track the highest and lowest observed value.
Where possible, replace uses of the custom JSON library with the standard
library. The custom library treats interface and unnamed lteral types
differently, so this change avoids those even where it would probably be safe
to switch them.
This continues the push of plumbing contexts through tendermint. I
attempted to find all goroutines in the production code (non-test) and
made sure that these threads would exit when their contexts were
canceled, and I believe this PR does that.
This is a very small change, but removes a method from the
`service.Service` interface (a win!) and forces callers to explicitly
pass loggers in to objects during construction rather than (later)
injecting them. There's not a real need for this kind of lazy
construction of loggers, and I think a decent potential for confusion
for mutable loggers.
The main concern I have is that this changes the constructor API for
ABCI clients. I think this is fine, and I suspect that as we plumb
contexts through, and make changes to the RPC services there'll be a
number of similar sorts of changes to various (quasi) public
interfaces, which I think we should welcome.
This is part of the work described by #7156.
Remove "unbuffered subscriptions" from the pubsub service.
Replace them with a dedicated blocking "observer" mechanism.
Use the observer mechanism for indexing.
Add a SubscribeWithArgs method and deprecate the old Subscribe
method. Remove SubscribeUnbuffered entirely (breaking).
Rework the Subscription interface to eliminate exposed channels.
Subscriptions now use a context to manage lifecycle notifications.
Internalize the eventbus package.
The code in the Tendermint repository makes heavy use of import aliasing.
This is made necessary by our extensive reuse of common base package names, and
by repetition of similar names across different subdirectories.
Unfortunately we have not been very consistent about which packages we alias in
various circumstances, and the aliases we use vary. In the spirit of the advice
in the style guide and https://github.com/golang/go/wiki/CodeReviewComments#imports,
his change makes an effort to clean up and normalize import aliasing.
This change makes no API or behavioral changes. It is a pure cleanup intended
o help make the code more readable to developers (including myself) trying to
understand what is being imported where.
Only unexported names have been modified, and the changes were generated and
applied mechanically with gofmt -r and comby, respecting the lexical and
syntactic rules of Go. Even so, I did not fix every inconsistency. Where the
changes would be too disruptive, I left it alone.
The principles I followed in this cleanup are:
- Remove aliases that restate the package name.
- Remove aliases where the base package name is unambiguous.
- Move overly-terse abbreviations from the import to the usage site.
- Fix lexical issues (remove underscores, remove capitalization).
- Fix import groupings to more closely match the style guide.
- Group blank (side-effecting) imports and ensure they are commented.
- Add aliases to multiple imports with the same base package name.
This document attempts to capture and discuss some of the areas of Tendermint that seem to be cited as causing performance issue. I'm hoping to continue to gather feedback and input on this document to better understand what issues Tendermint performance may cause for our users.
The overall goal of this document is to allow the maintainers and community to get a better sense of these issues and to be more capably able to discuss them and weight trade-offs about any proposed performance-focused changes. This document does not aim to propose any performance improvements. It does suggest useful places for benchmarks and places where additional metrics would be useful for diagnosing and further understanding Tendermint performance.
Please comment with areas where my reasoning seems off or with additional areas that Tendermint performance may be causing user pain.
Issues reported in Osmosis, where the message is extremely long. Also, there is absolutely no reason to log the message IMO. If we must, we can make the message log DEBUG.