* libs/log: remove Must constructor
* Update test/e2e/node/main.go
Co-authored-by: M. J. Fromberger <michael.j.fromberger@gmail.com>
* use stdlog
Updates #8077. The panic handler for consensus currently attempts to effect a
clean shutdown, but this can leave a failed node running in an unknown state
for an arbitrary amount of time after the failure.
Since a panic at this point means consensus is already irrecoverably broken, we
should not allow the node to continue executing. After making a best effort to
shut down the write-ahead log, re-panic to ensure the node will terminate before
any further state transitions are processed.
Even with this change, it is possible some transitions may occur while the
cleanup is happening. It might be preferable to abort unconditionally without
any attempt at cleanup.
Related changes:
- Clean up the creation of WAL directories.
- Filter WAL close errors at rethrow.
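The recover, clean up, and re-panic pattern described above can be sketched roughly as follows. This is a minimal illustration, not the consensus code itself: `closer`, `runGuarded`, `tryGuarded`, `fakeWAL`, and `errAlreadyClosed` are hypothetical names, and the rule for which close errors are filtered is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// closer stands in for the write-ahead log (hypothetical interface).
type closer interface{ Close() error }

// errAlreadyClosed is a hypothetical sentinel for a benign close error.
var errAlreadyClosed = errors.New("wal: already closed")

// runGuarded runs step. If step panics, it makes a best-effort attempt to
// close wal and then panics again, so the caller cannot keep running.
func runGuarded(wal closer, step func()) {
	defer func() {
		r := recover()
		if r == nil {
			return // no panic, nothing to do
		}
		// Assumption: only unexpected close errors are worth reporting.
		if err := wal.Close(); err != nil && !errors.Is(err, errAlreadyClosed) {
			panic(fmt.Sprintf("%v (additionally, closing the WAL failed: %v)", r, err))
		}
		panic(r) // rethrow the original panic value
	}()
	step()
}

// fakeWAL records whether Close was called.
type fakeWAL struct{ closed bool }

func (w *fakeWAL) Close() error { w.closed = true; return nil }

// tryGuarded reports the value runGuarded re-panicked with, or nil.
func tryGuarded(wal closer, step func()) (recovered any) {
	defer func() { recovered = recover() }()
	runGuarded(wal, step)
	return nil
}

func main() {
	wal := &fakeWAL{}
	got := tryGuarded(wal, func() { panic("consensus failure") })
	fmt.Println(got, wal.closed)
}
```

In a real node nothing would recover the second panic, so the process terminates; `tryGuarded` exists here only so the sketch can be exercised.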
The message handling in this reactor is all under control of the reactor
itself, and does not call out to callbacks or other externally-supplied code.
It doesn't need to check for panics.
- Remove an irrelevant channel ID check.
- Remove an unnecessary panic recovery wrapper.
The PEX reactor has a simple feedback control mechanism to decide how often to
poll peers for peer address updates. The idea is to poll more frequently when
knowledge of the network is less, and decrease frequency as knowledge grows.
This change solves two problems:
1. It is possible in some cases we may poll a peer "too often" and get dropped
by that peer for spamming.
2. The first successful peer update with any content resets the polling timer
to a very long time (10m), meaning if we are unlucky in getting an
incomplete reply while the network is small, we may not try again for a very
long time. This may contribute to difficulties bootstrapping sync.
The main change here is to only update the interval when new information is
added to the system, and not (as before) whenever a request is sent out to a
peer. The rate computation is essentially the same as before, although the code
has been a bit simplified, and I consolidated some of the error handling so
that we don't have to check in multiple places for the same conditions.
Related changes:
- Improve error diagnostics for too-soon and overflow conditions.
- Clean up state handling in the poll interval computation.
- Pin the minimum interval to avert a chance of PEX spamming a peer.
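A rough sketch of the idea, under stated assumptions: the interval only changes when a reply adds new addresses, and the result is pinned between a minimum and a maximum. The names (`pollState`, `observe`), the constants, and the quadratic scaling are all illustrative and are not taken from the reactor.

```go
package main

import (
	"fmt"
	"time"
)

const (
	minInterval = 100 * time.Millisecond // hypothetical floor
	maxInterval = 10 * time.Minute       // hypothetical ceiling
)

// pollState tracks how many addresses are known and the current interval.
type pollState struct {
	known    int
	interval time.Duration
}

func newPollState() *pollState { return &pollState{interval: minInterval} }

// observe records that a peer reply added `added` previously unknown
// addresses and returns the interval to wait before the next poll.
// A reply that adds nothing leaves the interval unchanged.
func (p *pollState) observe(added int) time.Duration {
	if added <= 0 {
		return p.interval
	}
	p.known += added
	// The smaller the share of new information, the longer the wait.
	ratio := float64(p.known) / float64(added)
	next := float64(minInterval) * ratio * ratio
	// Clamp in floating point so the conversion below cannot overflow.
	if next < float64(minInterval) {
		next = float64(minInterval)
	}
	if next > float64(maxInterval) {
		next = float64(maxInterval)
	}
	p.interval = time.Duration(next)
	return p.interval
}

func main() {
	p := newPollState()
	for _, added := range []int{10, 0, 1} {
		fmt.Println(added, p.observe(added))
	}
}
```

Because an empty or incomplete reply does not touch the interval, an unlucky early response cannot push the next poll out to the maximum.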
This change set implements the most recent version of `FinalizeBlock`.
# What does this change actually contain?
* This change set is rather large but fear not! The majority of the files touched and changes are renaming `ResponseDeliverTx` to `ExecTxResult`. This should be a pretty inoffensive change since they're effectively the same type but with a different name.
* The `execBlockOnProxyApp` function was removed entirely, since it served as just a wrapper around the logic that is now mostly encapsulated within `FinalizeBlock`.
* The `updateState` helper function has been made a public method on `State`. It was being exposed as a shim through the testing infrastructure, so this seemed innocuous.
* Tests already existed to ensure that the application received the `ByzantineValidators` and the `ValidatorUpdates`, but one was fixed up to ensure that `LastCommitInfo` was being sent across.
* Tests were removed from the `psql` indexer that seemed to search for an event in the indexer that was not being created.
# Questions for reviewers
* We store this [ABCIResponses](5721a13ab1/proto/tendermint/state/types.pb.go (L37)) type in the database as the block results. This type has changed since v0.35 to contain the `FinalizeBlock` response. I'm wondering if we need to do any shimming to keep the old data retrievable?
* Similarly, this change is exposed via the RPC through [ResultBlockResults](5721a13ab1/rpc/coretypes/responses.go (L69)) changing. Should we somehow shim or notify for this change?
closes: #7658
Since the goal of reading events at the head of the event log is to satisfy a
subscription style interface, there is no point in allowing head polling with
no wait interval. The pagination case already bypasses long polling, so the
extra option is unnecessary.
Set a minimum default long-polling interval for the head case.
Add a test for minimum delay.
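As a minimal sketch of the rule, assuming a request either pages through older items or long-polls at the head of the log: `effectiveWait`, `minWait`, and the `paging` flag are hypothetical names, and the one-second floor is an arbitrary value chosen for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// minWait is a hypothetical floor for long polling at the head of the log.
const minWait = 1 * time.Second

// effectiveWait returns how long a request may block waiting for new items.
// Paging requests never long-poll; head requests wait at least minWait.
func effectiveWait(requested time.Duration, paging bool) time.Duration {
	if paging {
		return 0
	}
	if requested < minWait {
		return minWait
	}
	return requested
}

func main() {
	fmt.Println(effectiveWait(0, false))             // head request, no wait given
	fmt.Println(effectiveWait(0, true))              // pagination request
	fmt.Println(effectiveWait(5*time.Second, false)) // head request, explicit wait
}
```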
This is mostly an extremely small change where I double a somewhat
arbitrarily set timeout from 1m to 2m for an entire test. When I put
these timeouts in the test, they were arbitrary, based on my local
performance (which is quite fast), and I expected that they'd need to
be tweaked in the future.
A big chunk of this PR is reworking a collection of helper functions
that produce somewhat intractable messages when a test fails, so that
the error messages take up less vertical space, hopefully without
losing any debuggability.
We're waiting between trying witnesses (which shouldn't be necessary
because the witnesses shouldn't depend on each other), and also
between *attempts*, and really the outer sleep should be enough.
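The loop shape after dropping the inner sleep might look like the sketch below. It is illustrative only: `witness`, `tryWitnesses`, `sleepCounter`, and `retryPause` are hypothetical, and returning on the first successful witness is an assumption about the desired behavior.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// witness is a hypothetical stand-in for querying one witness.
type witness func() error

var (
	errNoWitnesses = errors.New("no witnesses tried")
	errUnavailable = errors.New("witness unavailable")
)

// retryPause is a hypothetical pause between attempts.
const retryPause = 50 * time.Millisecond

// tryWitnesses queries every witness back to back and pauses only between
// whole attempts. The sleep function is injected so callers can fake it.
func tryWitnesses(ws []witness, attempts int, pause time.Duration, sleep func(time.Duration)) error {
	lastErr := errNoWitnesses
	for attempt := 0; attempt < attempts; attempt++ {
		if attempt > 0 {
			sleep(pause) // the only sleep: between attempts
		}
		for _, w := range ws {
			err := w()
			if err == nil {
				return nil
			}
			lastErr = err
		}
	}
	return lastErr
}

// sleepCounter counts sleeps instead of actually sleeping.
type sleepCounter struct{ n int }

func (s *sleepCounter) sleep(time.Duration) { s.n++ }

func main() {
	sc := &sleepCounter{}
	failing := func() error { return errUnavailable }
	err := tryWitnesses([]witness{failing, failing}, 3, retryPause, sc.sleep)
	fmt.Println(err, sc.n)
}
```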
This is a little coarse, but the idea is that we'll send information
about the channels a peer has upon the peer-up event that we send to
reactors that we can then use to reject peers (if needed) from reactors.
This solves the problem where statesync would hang in test networks
(and presumably real ones) where we would attempt to statesync from seed
nodes, thereby hanging silently forever.
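One way to picture this: the peer-up event carries the set of channels the peer advertises, and a reactor ignores peers that lack the channels it needs. The sketch below uses made-up types and channel IDs (`peerUpdate`, `usablePeer`, `snapshotChannel`, `pexChannel`), not the actual p2p API.

```go
package main

import "fmt"

// channelID identifies a p2p channel (hypothetical type).
type channelID uint16

// Hypothetical channel IDs, chosen only for this illustration.
const (
	pexChannel      channelID = 0x00
	snapshotChannel channelID = 0x60
)

// peerUpdate is a hypothetical peer event that also carries the channels
// the peer advertised when it connected.
type peerUpdate struct {
	NodeID   string
	Up       bool
	Channels map[channelID]struct{}
}

// usablePeer reports whether a reactor should use the peer: the peer must
// be up and must advertise every channel the reactor requires.
func usablePeer(u peerUpdate, required ...channelID) bool {
	if !u.Up {
		return false
	}
	for _, id := range required {
		if _, ok := u.Channels[id]; !ok {
			return false
		}
	}
	return true
}

func main() {
	seed := peerUpdate{NodeID: "seed", Up: true,
		Channels: map[channelID]struct{}{pexChannel: {}}}
	full := peerUpdate{NodeID: "full", Up: true,
		Channels: map[channelID]struct{}{pexChannel: {}, snapshotChannel: {}}}
	fmt.Println(usablePeer(seed, snapshotChannel), usablePeer(full, snapshotChannel))
}
```

With a check like this, a statesync reactor would skip a seed node that only speaks PEX instead of waiting on it indefinitely.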
This change implements the spec for `ProcessProposal`. It first calls the Tendermint block validation logic to check that all of the proposed block fields are well formed and do not violate any of the rules for Tendermint to consider the block valid, and then passes the validated block to `ProcessProposal`.
This change also adds additional fixtures to test the change. It adds the `baseMock` type, which holds a mock as well as a reference to `BaseApplication`. If a function was not set up by the test on the contained mock Application, the type delegates to the `BaseApplication` and returns what `BaseApplication` returns.
The change also switches the `makeState` helper to take an arg struct so that an ABCI application can be plumbed through when needed.
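The delegation idea behind `baseMock` can be illustrated with a hand-rolled version. Note that this sketch swaps the real mock for plain function fields, and the `application` interface, its two methods, and `baseApplication` here are simplified stand-ins rather than the ABCI types.

```go
package main

import "fmt"

// application is a simplified, hypothetical stand-in for the ABCI interface.
type application interface {
	ProcessProposal(txs []string) bool
	Info() string
}

// baseApplication provides default answers, like a no-op base app.
type baseApplication struct{}

func (baseApplication) ProcessProposal([]string) bool { return true }
func (baseApplication) Info() string                  { return "base" }

// baseMock uses an override when the test has set one and otherwise
// delegates to the embedded baseApplication.
type baseMock struct {
	baseApplication
	processProposal func(txs []string) bool // nil unless the test sets it
}

func (m *baseMock) ProcessProposal(txs []string) bool {
	if m.processProposal != nil {
		return m.processProposal(txs)
	}
	return m.baseApplication.ProcessProposal(txs)
}

// Compile-time check that baseMock satisfies the interface.
var _ application = (*baseMock)(nil)

func main() {
	m := &baseMock{}
	fmt.Println(m.ProcessProposal(nil), m.Info()) // both delegated

	m.processProposal = func([]string) bool { return false }
	fmt.Println(m.ProcessProposal(nil)) // overridden by the test
}
```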
closes: #7656
* p2p: mconn track last message for pongs
* fix spell
* cr feedback
* test fix part one
* cleanup tests
* fix comment
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
Add deprecation logs when websocket is enabled
As promised in ADR 075, this causes the node to log (without error) when
websocket transport is enabled, and also when subscribers connect.
This method implements the eventlog extension interface to expose ABCI metadata
to the log for query processing. Only the types that have ABCI events need to
implement this.
- Add an event log to the environment
- Add a sketch of the handler method
- Add an /events RPCFunc to the route map
- Implement query logic
- Subscribe to pubsub if configured, handle termination
This is the first step in removing the mutex from ABCI applications:
making our test applications hold mutexes, which this does, hopefully
with zero impact. If this lands well, then we can explore deleting the
other mutexes (in the ABCI server and the clients). While this change
is not user impacting at all, removing the other mutexes *will* be.
In pursuit of this, I've changed the KV app somewhat, to put almost
all of the logic in the base application and make the persistent
application mostly be a wrapper on top of that with a different
storage layer.
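For illustration, an application that guards its own state might look like the sketch below. `kvApp`, its methods, and `fillConcurrently` are hypothetical and far simpler than the real KV store application.

```go
package main

import (
	"fmt"
	"sync"
)

// kvApp is a hypothetical test application that protects its own state,
// so it stays safe even if callers stop serializing requests.
type kvApp struct {
	mu    sync.Mutex
	state map[string]string
}

func newKVApp() *kvApp { return &kvApp{state: make(map[string]string)} }

// Set stores a value while holding the application's own lock.
func (a *kvApp) Set(key, value string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.state[key] = value
}

// Size returns the number of stored keys.
func (a *kvApp) Size() int {
	a.mu.Lock()
	defer a.mu.Unlock()
	return len(a.state)
}

// fillConcurrently writes n distinct keys from n goroutines and waits
// for all of them to finish.
func fillConcurrently(app *kvApp, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			app.Set(fmt.Sprintf("key-%d", i), "value")
		}(i)
	}
	wg.Wait()
}

func main() {
	app := newKVApp()
	fillConcurrently(app, 100)
	fmt.Println(app.Size())
}
```

Once every application locks internally like this, the outer mutexes in the server and clients become redundant for these test apps, which is what makes removing them later feasible.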
The previous implementation of the *test* was flaky, and this irons
out some of those problems. The primary assertion that was failing
(less than 1% of the time) was an error on close that I think we
shouldn't care about.
Implement the basic cursor and eventlog types described in ADR 075. Handle
encoding and decoding as strings for compatibility with JSON.
- Add unit tests for the required order and synchronization properties.
- Add hooks for metrics, with one value to be expanded later.
- Update ADR 075 to match the specifics of the implementation so far.
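As a rough sketch of string-encoded cursors: the layout below (a 64-bit timestamp and a 16-bit sequence number rendered as fixed-width hex) is an assumption made for illustration, as are the names `cursor` and `roundTrip`. Implementing `encoding.TextMarshaler` and `encoding.TextUnmarshaler` is what lets `encoding/json` carry the value as a plain JSON string.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// cursor is a hypothetical position in the event log.
type cursor struct {
	timestamp uint64
	sequence  uint16
}

// String renders the cursor as fixed-width hex, so that the string order
// matches the cursor order.
func (c cursor) String() string {
	return fmt.Sprintf("%016x-%04x", c.timestamp, c.sequence)
}

// MarshalText makes encoding/json emit the cursor as a JSON string.
func (c cursor) MarshalText() ([]byte, error) { return []byte(c.String()), nil }

// UnmarshalText parses the form produced by String.
func (c *cursor) UnmarshalText(data []byte) error {
	ts, seq, ok := strings.Cut(string(data), "-")
	if !ok {
		return fmt.Errorf("invalid cursor %q", data)
	}
	t, err := strconv.ParseUint(ts, 16, 64)
	if err != nil {
		return fmt.Errorf("invalid cursor timestamp: %w", err)
	}
	s, err := strconv.ParseUint(seq, 16, 16)
	if err != nil {
		return fmt.Errorf("invalid cursor sequence: %w", err)
	}
	c.timestamp, c.sequence = t, uint16(s)
	return nil
}

// roundTrip encodes a cursor to JSON and decodes it again.
func roundTrip(c cursor) (cursor, error) {
	data, err := json.Marshal(c)
	if err != nil {
		return cursor{}, err
	}
	var out cursor
	err = json.Unmarshal(data, &out)
	return out, err
}

func main() {
	c := cursor{timestamp: 0x1234, sequence: 7}
	back, err := roundTrip(c)
	fmt.Println(c, back == c, err)
}
```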
* event: Added Events after evidence validation; evidence: refactored AddEvidence
Added context and Metrics as parameters for the pool constructor
* evidence: pushed event firing into evidence pool and added metrics to represent the size of the evpool
* state: fixed parameters of evpool mock functions
* evidence: added test to confirm events are generated
* Removed obsolete EvidenceEventPublisher interface
* evidence: pool removed error on missing eventbus
Issue #2871 covers several problems. This PR tackles one of them: a `HasVoteMessage` with an invalid validator index sent by a bad peer. It prevents the bad vote from reaching the peerMsgQueue.
Future work: check the other bad-message cases, and plumb reactor errors through to the peer manager so that it can disconnect the peer sending the bad messages.
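The kind of check involved can be sketched as follows, assuming the reactor knows the size of the current validator set. `hasVoteMessage`, `validateHasVote`, and `enqueue` are simplified, hypothetical stand-ins for the consensus types and queue.

```go
package main

import "fmt"

// hasVoteMessage is a simplified, hypothetical version of the message.
type hasVoteMessage struct {
	Height int64
	Round  int32
	Index  int32 // validator index claimed by the peer
}

// validateHasVote rejects messages whose validator index is outside the
// current validator set.
func validateHasVote(msg hasVoteMessage, numValidators int) error {
	if msg.Index < 0 || int(msg.Index) >= numValidators {
		return fmt.Errorf("invalid validator index %d (validator set size %d)",
			msg.Index, numValidators)
	}
	return nil
}

// enqueue forwards a message to the queue only if it passes validation.
func enqueue(queue chan<- hasVoteMessage, msg hasVoteMessage, numValidators int) error {
	if err := validateHasVote(msg, numValidators); err != nil {
		return err // the bad message never reaches the queue
	}
	queue <- msg
	return nil
}

func main() {
	queue := make(chan hasVoteMessage, 1)
	fmt.Println(enqueue(queue, hasVoteMessage{Height: 1, Index: 9}, 4))
	fmt.Println(enqueue(queue, hasVoteMessage{Height: 1, Index: 3}, 4))
	fmt.Println(len(queue))
}
```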
When testing rollback feature in the Cosmos SDK, we found that the app hash
in Tendermint after rollback was the value after the latest block, rather than
before it.
Co-authored-by: Callum Waters <cmwaters19@gmail.com>
While I'd hoped to be able to make the socket client less weird, I
think that this is a nice middle ground in terms of improving
readability and removing the vestigial components without breaking
anything or radically changing the underlying assumptions.
In the future we'd want to have requests be identified by a request
ID, and then we could drop the request tracking logic in the client
entirely, and this is protocol breaking. The alternatives aren't
substantively different than the current implementation.
This follows along in the spirit of #7845 but is orthogonal to
removing `CheckTxAsync` (which will come after the previous commit
lands), so I thought I'd get it out there earlier.