Updates #8077. The panic handler for consensus currently attempts to effect a
clean shutdown, but this can leave a failed node running in an unknown state
for an arbitrary amount of time after the failure.
Since a panic at this point means consensus is already irrecoverably broken, we
should not allow the node to continue executing. After making a best effort to
shut down the write-ahead log, re-panic to ensure the node will terminate before
any further state transitions are processed.
Even with this change, it is possible some transitions may occur while the
cleanup is happening. It might be preferable to abort unconditionally without
any attempt at cleanup.
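
For illustration, the "best-effort cleanup, then re-panic" pattern described above can be sketched roughly as follows. This is not the code from the change: `walCloser`, `fakeWAL`, and `runWithWALCleanup` are hypothetical names, and the `recover` in `main` exists only so the demo can print what happened.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// walCloser is a hypothetical stand-in for the consensus write-ahead log.
type walCloser interface {
	Stop() error
}

// fakeWAL is a toy WAL that only records whether it was stopped.
type fakeWAL struct{ stopped bool }

func (w *fakeWAL) Stop() error {
	if w.stopped {
		return errors.New("WAL already stopped")
	}
	w.stopped = true
	return nil
}

// runWithWALCleanup runs step. If step panics, it makes a best-effort
// attempt to stop the WAL and then re-panics with the original value, so
// the caller cannot keep processing state transitions.
func runWithWALCleanup(wal walCloser, step func()) {
	defer func() {
		if r := recover(); r != nil {
			if err := wal.Stop(); err != nil {
				// Cleanup is best effort: report the error, do not mask the panic.
				fmt.Fprintf(os.Stderr, "error closing WAL: %v\n", err)
			}
			panic(r)
		}
	}()
	step()
}

func main() {
	wal := &fakeWAL{}
	defer func() {
		// Demo only: a real node would let the panic terminate the process.
		r := recover()
		fmt.Printf("recovered=%v walStopped=%v\n", r, wal.stopped)
	}()
	runWithWALCleanup(wal, func() { panic("consensus failure") })
}
```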
Related changes:
- Clean up the creation of WAL directories.
- Filter WAL close errors at rethrow.
This change set implements the most recent version of `FinalizeBlock`.
# What does this change actually contain?
* This change set is rather large but fear not! The majority of the files touched and changes are renaming `ResponseDeliverTx` to `ExecTxResult`. This should be a pretty inoffensive change since they're effectively the same type but with a different name.
* The `execBlockOnProxyApp` function was removed entirely, since it served as just a wrapper around logic that is now mostly encapsulated within `FinalizeBlock`.
* The `updateState` helper function has been made a public method on `State`. It was being exposed as a shim through the testing infrastructure, so this seemed innocuous.
* Tests already existed to ensure that the application received the `ByzantineValidators` and the `ValidatorUpdates`, but one was fixed up to ensure that `LastCommitInfo` was being sent across.
* Tests were removed from the `psql` indexer that seemed to search for an event in the indexer that was not being created.
# Questions for reviewers
* We store this [ABCIResponses](5721a13ab1/proto/tendermint/state/types.pb.go (L37)) type in the database as the block results. This type has changed since v0.35 to contain the `FinalizeBlock` response. I'm wondering if we need to do any shimming to keep the old data retrievable?
* Similarly, this change is exposed via the RPC through [ResultBlockResults](5721a13ab1/rpc/coretypes/responses.go (L69)) changing. Should we somehow shim or notify for this change?
closes: #7658
This is mostly an extremely small change where I double a somewhat
arbitrarily set timeout from 1m to 2m for an entire test. When I put
these timeouts in the test, they were arbitrary, based on my local
performance (which is quite fast), and I expected that they'd need to
be tweaked in the future.
A big chunk of this PR is reworking a collection of helper functions
that produce somewhat intractable messages when a test fails, so that
the error messages take up less vertical space, hopefully without
losing any debuggability.
This change implements the spec for `ProcessProposal`. It first calls the Tendermint block validation logic to check that all of the proposed block fields are well formed and do not violate any of the rules Tendermint uses to consider a block valid. It then passes the validated block to `ProcessProposal`.
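
As a rough sketch of that ordering (validate first, then consult the application), assuming a heavily simplified block and a one-method application interface; `block`, `application`, `rejectEmpty`, `validateBlock`, and `processProposal` are all hypothetical names, not the types used in this change:

```go
package main

import (
	"errors"
	"fmt"
)

// block is a hypothetical, heavily trimmed stand-in for a proposed block.
type block struct {
	Height int64
	Txs    [][]byte
}

// application is a hypothetical one-method slice of the ABCI interface.
type application interface {
	ProcessProposal(b block) bool
}

// rejectEmpty is a toy application that rejects blocks without transactions.
type rejectEmpty struct{}

func (rejectEmpty) ProcessProposal(b block) bool { return len(b.Txs) > 0 }

// validateBlock stands in for the block validation logic; here it only
// checks the height.
func validateBlock(b block, expectedHeight int64) error {
	if b.Height != expectedHeight {
		return errors.New("unexpected block height")
	}
	return nil
}

// processProposal hands the block to the application only after it has
// passed validation.
func processProposal(app application, b block, expectedHeight int64) (bool, error) {
	if err := validateBlock(b, expectedHeight); err != nil {
		return false, err
	}
	return app.ProcessProposal(b), nil
}

func main() {
	ok, err := processProposal(rejectEmpty{}, block{Height: 5, Txs: [][]byte{[]byte("tx")}}, 5)
	fmt.Println(ok, err)
	ok, err = processProposal(rejectEmpty{}, block{Height: 4}, 5)
	fmt.Println(ok, err)
}
```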
This change also adds additional fixtures to test the change. It adds the `baseMock` type, which holds a mock as well as a reference to `BaseApplication`. If a function was not set up by the test on the contained mock Application, the type delegates to the `BaseApplication` and returns what `BaseApplication` returns.
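
The delegation idea can be sketched without a mocking library. The sketch below assumes a one-method `Application` interface and uses a plain function field in place of the real mock; the field name `processProposal` and the simplified signatures are hypothetical:

```go
package main

import "fmt"

// Application is a hypothetical one-method slice of the ABCI interface.
type Application interface {
	ProcessProposal(txs [][]byte) bool
}

// BaseApplication stands in for the default application, which accepts
// every proposal in this sketch.
type BaseApplication struct{}

func (BaseApplication) ProcessProposal([][]byte) bool { return true }

// baseMock holds an optional per-test override plus the base application.
type baseMock struct {
	base            BaseApplication
	processProposal func(txs [][]byte) bool // nil unless the test sets it
}

// ProcessProposal uses the override if the test configured one and
// otherwise returns whatever BaseApplication returns.
func (m *baseMock) ProcessProposal(txs [][]byte) bool {
	if m.processProposal != nil {
		return m.processProposal(txs)
	}
	return m.base.ProcessProposal(txs)
}

func main() {
	var app Application = &baseMock{}
	fmt.Println(app.ProcessProposal(nil)) // delegated to BaseApplication

	app = &baseMock{processProposal: func([][]byte) bool { return false }}
	fmt.Println(app.ProcessProposal(nil)) // answered by the override
}
```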
The change also switches the `makeState` helper to take an arg struct so that an ABCI application can be plumbed through when needed.
closes: #7656
This is the first step in removing the mutex from ABCI applications:
making our test applications hold mutexes, which this does, hopefully
with zero impact. If this lands well, then we can explore deleting the
other mutexes (in the ABCI server and the clients). While this change
is not user impacting at all, removing the other mutexes *will* be.
In pursuit of this, I've changed the KV app somewhat, to put almost
all of the logic in the base application and make the persistent
application mostly be a wrapper on top of that with a different
storage layer.
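
A minimal sketch of what "the application holds its own mutex" means, assuming a toy key/value application; `kvApp`, `Set`, and `Len` are hypothetical and not the KV app's actual API:

```go
package main

import (
	"fmt"
	"sync"
)

// kvApp is a hypothetical test application that guards its own state
// instead of relying on a mutex in the ABCI server or client.
type kvApp struct {
	mu    sync.Mutex
	state map[string]string
}

func newKVApp() *kvApp { return &kvApp{state: make(map[string]string)} }

// Set stores a key while holding the application's own lock.
func (a *kvApp) Set(k, v string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.state[k] = v
}

// Len reports the number of stored keys.
func (a *kvApp) Len() int {
	a.mu.Lock()
	defer a.mu.Unlock()
	return len(a.state)
}

func main() {
	app := newKVApp()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Concurrent callers are safe because the app locks internally.
			app.Set(fmt.Sprintf("key-%d", i), "v")
		}(i)
	}
	wg.Wait()
	fmt.Println(app.Len())
}
```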
* event: Added Events after evidence validation; evidence: refactored AddEvidence
Added context and Metrics as parameter for the pool constructor
* evidence: pushed event firing into evidence pool and added metrics to represent the size of the evpool
* state: fixed parameters of evpool mock functions
* evidence: added test to confirm events are generated
* Removed obsolete EvidenceEventPublisher interface
* evidence: pool removed error on missing eventbus
I went through #2871 and found several issues. This PR tackles one of them: a `HasVoteMessage` with an invalid validator index sent by a bad peer. It prevents the bad vote from reaching the peerMsgQueue.
Future work: check the other bad-message cases and plumb the reactor errors through to the peer manager, so that the peer sending the bad messages can be disconnected.
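
The check amounts to bounds-checking the validator index before the message is queued. A hedged sketch, using a trimmed-down `hasVote` struct and a buffered channel in place of the real message type and peerMsgQueue (all names here are hypothetical):

```go
package main

import (
	"errors"
	"fmt"
)

// hasVote is a hypothetical, trimmed-down version of HasVoteMessage.
type hasVote struct {
	Height int64
	Round  int32
	Index  int32
}

var errInvalidValidatorIndex = errors.New("invalid validator index")

// validateHasVote rejects indexes outside the current validator set.
func validateHasVote(msg hasVote, numValidators int) error {
	if msg.Index < 0 || int(msg.Index) >= numValidators {
		return fmt.Errorf("%w: %d (validator set size %d)",
			errInvalidValidatorIndex, msg.Index, numValidators)
	}
	return nil
}

// enqueue forwards msg to the queue only if it passes validation, so a
// bad message never reaches the consumer.
func enqueue(queue chan<- hasVote, msg hasVote, numValidators int) error {
	if err := validateHasVote(msg, numValidators); err != nil {
		return err
	}
	queue <- msg
	return nil
}

func main() {
	queue := make(chan hasVote, 2)
	fmt.Println(enqueue(queue, hasVote{Height: 1, Index: 3}, 4))
	fmt.Println(enqueue(queue, hasVote{Height: 1, Index: 9}, 4))
	fmt.Println("queued:", len(queue))
}
```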
* Rebased and git-squashed the commits in PR #6546
migrate abci to finalizeBlock
work on abci, proxy and mempool
abciresponse, block events, indexer, some tests
fix some tests
fix errors
fix errors in abci
fix tests and errors
* Fixes after rebasing PR#6546
* Restored height to RequestFinalizeBlock & other
* Fixed more UTs
* Fixed kvstore
* More UT fixes
* last TC fixed
* make format
* Update internal/consensus/mempool_test.go
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
* Addressed @williambanfield's comments
* Fixed UTs
* Addressed last comments from @williambanfield
* make format
Co-authored-by: marbar3778 <marbar3778@yahoo.com>
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
Our test cases spew a lot of files and directories around $TMPDIR. Make more
thorough use of the testing package's TempDir methods to ensure these are
cleaned up.
In a few cases, this required plumbing test contexts through existing helper
code. In a couple places an explicit path was required, to work around cases
where we do global setup during a TestMain function. Those cases probably
deserve more thorough cleansing (preferably with fire), but for now I have just
worked around it to keep focused on the cleanup.
This change adds logic to double the message delay bound after every 10 rounds. Alternatives to this somewhat magic number were discussed. Specifically, whether or not to make '10' modifiable as a parameter was discussed. Since this behavior only exists to ensure liveness in the case that these values were poorly chosen to begin with, a method to configure this value was not created. Chains that notice many 'untimely' rounds per the [relevant metric](https://github.com/tendermint/tendermint/pull/7709) are expected to take action to increase the configured message delay to more accurately match the conditions of the network.
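
A small sketch of the arithmetic, under the assumption that the bound is the configured delay for rounds 0 through 9, twice that for rounds 10 through 19, and so on. The function and constant names are hypothetical, and the exact round boundaries in the implementation may differ:

```go
package main

import (
	"fmt"
	"time"
)

// roundsPerDoubling is the fixed, non-configurable interval discussed above.
const roundsPerDoubling = 10

// messageDelayForRound doubles the configured bound once for every
// completed group of ten rounds. Overflow is not handled in this sketch.
func messageDelayForRound(base time.Duration, round int32) time.Duration {
	delay := base
	for i := int32(0); i < round/roundsPerDoubling; i++ {
		delay *= 2
	}
	return delay
}

func main() {
	base := 500 * time.Millisecond
	for _, round := range []int32{0, 9, 10, 25} {
		fmt.Printf("round %d: %v\n", round, messageDelayForRound(base, round))
	}
}
```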
closes: https://github.com/tendermint/spec/issues/371
This is clearly a cobweb in the code, and may point toward a solution to #7729, though this is difficult to backport because we don't have contexts in 0.35.
This pull request merges in the changes for implementing Proposer-based timestamps into `master`. The work was primarily being done in the `wb/proposer-based-timestamps` branch, with changes being merged into that branch during development. This pull request represents an amalgamation of the changes made into that development branch. All of the changes that were placed into that branch have been cleanly rebased on top of the latest `master`. The changes compile and the tests pass insofar as our tests in general pass.
### Note To Reviewers
These changes have been extensively reviewed during development. There is not much new here. In the interest of making effective use of time, I would recommend against trying to perform a complete audit of the changes presented and instead examine for mistakes that may have occurred during the process of rebasing the changes. I gave the complete change set a first pass for any issues, but additional eyes would be very appreciated.
In sum, this change set does the following:
closes #6942
merges in #6849
Remove the pubsub.Query interface and instead use the concrete query type.
Nothing uses any other implementation but pubsub/query.
* query: remove the error from the Matches method
* Update all usage.
There are no further uses of this package anywhere in Tendermint.
All the uses in the Cosmos SDK are for types that now work correctly with the
standard encoding/json package.
The main change here is to use encoding/json to encode and decode RPC
parameters, rather than the custom tmjson package. This includes:
- Update the HTTP POST handler parameter handling.
- Add field tags to 64-bit integer types to get string encoding (to match amino/tmjson).
- Add marshalers to struct types that mention interfaces.
- Inject wrappers to decode interface arguments in RPC handlers.
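
To illustrate the field-tag item in the list above: the standard `encoding/json` package encodes an integer field as a JSON string when the tag carries the `string` option. The `heightParams` type and its helpers are hypothetical and exist only for this example:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// heightParams is a hypothetical RPC parameter struct. The ",string"
// option makes encoding/json read and write the int64 as a JSON string.
type heightParams struct {
	Height int64 `json:"height,string"`
}

// encodeParams marshals p with the standard library encoder.
func encodeParams(p heightParams) (string, error) {
	bits, err := json.Marshal(p)
	return string(bits), err
}

// decodeParams unmarshals data with the standard library decoder.
func decodeParams(data string) (heightParams, error) {
	var p heightParams
	err := json.Unmarshal([]byte(data), &p)
	return p, err
}

func main() {
	out, err := encodeParams(heightParams{Height: 42})
	fmt.Println(out, err) // {"height":"42"} <nil>

	p, err := decodeParams(`{"height":"9007199254740993"}`)
	fmt.Println(p.Height, err)
}
```

Encoding 64-bit values as strings keeps them intact for JSON clients that parse numbers as 64-bit floats.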