Add deprecation logs when websocket is enabled
As promised in ADR 075, this causes the node to log (without error) when
websocket transport is enabled, and also when subscribers connect.
Closes#7073
As part of the 0.36 cycle we've discussed and decided to remove the mutex in tendermint that protects the ABCI application. First, applications should be able to be responsible for their own concurrency control, and can make more fine-grained decisions about concurrent use than tendermint ever could. Second, I've observed in recent weeks as we've been making this change that the mutex wasn't applied particularly consistently in many cases (e.g. multiple "local" connections to the application had multiple locks, etc.) so this will give more consistent experiences across ABCI execution environments, and simplifies the tendermint ABCI handling code.
- Update documentation to deprecate the old methods.
- Add Events methods to HTTP, WS, and Local clients.
- Add Events method to the light client wrapper.
- Rename legacy events client to SubscriptionClient.
* testing: reduce usage of the MustDefualtLogger constructor
* Apply suggestions from code review
Co-authored-by: M. J. Fromberger <michael.j.fromberger@gmail.com>
* cleanup tests
Co-authored-by: M. J. Fromberger <michael.j.fromberger@gmail.com>
This method implements the eventlog extension interface to expose ABCI metadata
to the log for query processing. Only the types that have ABCI events need to
implement this.
- Add an event log to the environment
- Add a sketch of the handler method
- Add an /events RPCFunc to the route map
- Implement query logic
- Subscribe to pubsub if confingured, handle termination
This is the first step in removing the mutex from ABCI applications:
making our test applications hold mutexes, which this does, hopefully
with zero impact. If this lands well, then we can explore deleting the
other mutexes (in the ABCI server and the clients.) While this change
is not user impacting at all, removing the other mutexes *will* be.
In persuit of this, I've changed the KV app somewhat, to put almost
all of the logic in the base application and make the persistent
application mostly be a wrapper on top of that with a different
storage layer.
The previous implementation of the *test* was flaky, and this irons
out some of those problems. The primary assertion that was failing
(less than 1% of the time) was an error on close that I think we
shouldn't care about.
* Update OpenAPI docs.
- Add an Events tag for event methods.
- Add schema entries for event request/response types.
- Clarify the documentation for broadcast methods.
- Note that websocket will be deprecated in v0.36.
There are a lot of existing links to the master section of the site, and my
attempts to get a redirector working have so far not succeeded. While it still
makes sense to not publish docs for unreleased code, a 404 is almost certainly
more disruptive than seeing docs for unreleased stuff.
This includes the docs in the build again, but does not add them back to the
selector menu. That allows URLs to resolve but encourages folks to use the
released versions when they have a choice.
I left the redirect for the RPC link in place, since that's still useful.
Updates #7935.
Implement the basic cursor and eventlog types described in ADR 075. Handle
encoding and decoding as strings for compatibility with JSON.
- Add unit tests for the required order and synchronization properties.
- Add hooks for metrics, with one value to be expanded later.
- Update ADR 075 to match the specifics of the implementation so far.
* event: Added Events after evidence validation; evidence: refactored AddEvidence
Added context and Metrics as parameter for the pool constructor
* evidence: pushed event firing into evidence pool and added metrics to represent the size of the evpool
* state: fixed parameters of evpool mock functions
* evidence: added test to confirm events are generated
* Removed obsolete EvidenceEventPublisher interface
* evidence: pool removed error on missing eventbus
Went through #2871, there are several issues, this PR tries to tackle the `HasVoteMessage` with an invalid validator index sent by a bad peer and it prevents the bad vote goes to the peerMsgQueue.
Future work, check other bad message cases and plumbing the reactor errors with the peer manager and then can disconnect the peer sending the bad messages.
I think in the future we should migrate from having our own logging
interface and use our logger directly, which I think would help
obviate this particular problem, but in the mean time, this seems safe.
## Summary
This pull request adds a default set of values to the new Synchrony parameters. These values were chosen by after observation of three live networks: Emoney, Osmosis, and the Cosmos Hub.
For the default Precision value, `505ms` was selected. The reasoning for this is summarized in https://github.com/tendermint/tendermint/issues/7724
For each observed chain, an experimental Message Delay was collected over a 24 hour period and an average over this period was calculated using this data. Values over 10s were considered outliers and treated separately for the average since the majority of observations were far below 10s. The message delay was calculated both for the quorum and the 'full' prevote. Description of the technique for collecting the experimental values can found in #7202. This value is calculated only using timestamps given by processes on the network, so large variation in values is almost certainly due to clock skew among the validator set.
`12s` is proposed for the default MessageDelay value. This value would easily accomodates all non-outlier values, allowing even E-money's 4.25s value to be valid. This would also allow some validators with skewed clocks to still participate without allowing for huge variation in the timestamps produced by the network. Additionally, for the currently listed use-cases of PBTS, such as unbonding period, and light client trust period, the current bounds for these are in weeks. Adding a few seconds of tolerance by default is therefore unlikely to have serious side-effects.
## Data
### Cosmos Hub
Observation Period: 2022-02-03 20:22-2022-02-04 20:22
Avg Full Prevote Message Delay: 1.27s
Outliers: 11s,13s,50s,106s,144s
Total Outlier Heights: 86
Avg Quorum Prevote Message Delay: .77s
Outliers: 10s,14s,107s,144s
Total Outlier Heights: 617
Total heights: 11528
### Osmosis
Observation Period: 2022-01-29 20:26-2022-01-28 20:26
Avg Quorum Prevote Message Delay: .46s
Outliers: 21s,50s
Total Outlier Heights: 26
NOTE: During the observation period, a 'full' prevote was not observed.
Total heights: 13983
### E-Money
Observation Period: 2022-02-07 04:29-2022-02-08 04:29
Avg Full Prevote Message Delay: 4.25s
Outliers: 12s,15s,39s
Total Outlier Heights: 128
Avg Quorum Prevote Message Delay: .20s
Outliers: 28s
Total Outlier Heights: 15
Total heights: 3791
When testing rollback feature in the Cosmos SDK, we found that the app hash
in Tendermint after rollback was the value after the latest block, rather than
before it.
Co-authored-by: Callum Waters <cmwaters19@gmail.com>
While I'd hoped to be able to make the socket client less weird, I
think that this is a nice middle ground in terms of improving
readability and removing the vestigal components without breaking
anything or radically changing the underlying assumptions.
In the future we'd want to have requests be identified by a request
ID, and then we could drop the request tracking logic in the client
entirely, and this is protocol breaking. The alternatives aren't
substantively different than the current implementation.
This change changes the ABCI socket client to allow goroutines to block writing to the internal queue. This has the effect ensuring that callers of the ABCI methods do not error on a full internal queue at the expense of allowing the number of goroutines waiting on this internal queue to grow in an unbounded fashion. This tradeoff seems preferable since it allows callers of the ABCI methods to be certain that a request that was made will reach the application if it is available.
Closes: #7827
This change was initially implemented here: e13b4386ff and never landed on v0.34, only v0.35+
This follows along in the spirit of #7845 but is orthogonal to
removing `CheckTxAsync` (which will come after the previous commit
lands,) so I thought I'd get it out there earlier.
The main function defers some things that do not run in the "normal" exit case
because we call os.Exit(0) explicitly. Since falling off the end of main does
the same thing, and also permits defers to run, let's do that.
After poking around #7828, I saw the oppertunity for this cleanup,
which I think is both reasonable on its own, and quite low impact, and
removes the math around process start time.