I see this panic in tests occasionally, and I don't think there's any
need to close this channel:
- it's only sent to in one place which has a select case with a
default clause, so there's no chance of deadlocks.
- the only place we recieve from it thas a timeout.
This contains two major changes:
- Remove the legacy test logging method, and just explicitly call the
noop logger. This is just to make the test logging behavior more
coherent and clear.
- Move the logging in the light package from the testing.T logger to
the noop logger. It's really the case that we very rarely need/want
to consider test logs unless we're doing reproductions and running a
narrow set of tests.
In most cases, I (for one) prefer to run in verbose mode so I can
watch progress of tests, but I basically never need to consider
logs. If I do want to see logs, then I can edit in the testing.T
logger locally (which is what you have to do today, anyway.)
This is failing intermittently, but it's a really simple test, and I
suspect that we're just running into thread scheduling issues on CI
nodes. I don't think making the test smaller reduces the utility of
this test.
closes: #7756
# What does this pull request change?
This pull request adds a new runbook for operators enountering errors related to the new Proposer-Based Timestamps algorithm. The goal of this runbook is to give operators a set of clear steps that they can follow if they are having issues producing blocks because of clock synchronization problems.
This pull request also renames the `*PrevoteDelay` metrics to drop the term `MessageDelay`. These metrics provide a combined view of `message_delay` + `synchrony` so the name may be confusing.
# Questions to reviewers
* Are there ways to make the set of steps clearer or are there any pieces that seem confusing?
The `.proto` file do not have the `nullable = false` annotation present on the `SynchronyParams` durations. This pull request updates the `SynchronyParams` to match the checked in proto files. Note, this does not make the code buildable against the latest protos. This pull request was achieved by checking out all files _not relevant_ to the `SynchronyParams` and removing the new `TimeoutParams` from the the `params.proto` file. Future updates will add these back.
This pull request also adds a `nil` check to the `pbParams.Synchrony` field in `ConsensusParamsFromProto`. Old versions of Tendermint will not have the `Synchrony` parameters filled in so this code would panic on startup.
We will fill in the empty fields with defaults, but per https://github.com/tendermint/tendermint/blob/master/docs/rfc/rfc-009-consensus-parameter-upgrades.md#only-update-hashedparams-on-hash-breaking-releases we will keep out of the hash during this release.
This is minor, but I was trying to write a test and realized that the
application reference in the harness isn't actually used, which is
quite confusing.
The events switch code is largely vestigal and is responsible for
wiring between the consensus state machine and the consensus
reactor. While there might have been a need, historicallly to managed
these subscriptions at runtime, it's nolonger used: subscriptions are
registered during startup, and then the switch shuts down at at the
end.
Eventually the EventSwitch should be replaced by a much smaller
implementation of an eventloop in the consensus state machine, but
cutting down on the scope of the event switch will help clarify the
requirements from the consensus side.
Bumps [github.com/stretchr/testify](https://github.com/stretchr/testify) from 1.7.0 to 1.7.1.
<details>
<summary>Commits</summary>
<ul>
<li><a href="083ff1c044"><code>083ff1c</code></a> Fixed didPanic to now detect panic(nil).</li>
<li><a href="1e36bfe104"><code>1e36bfe</code></a> Use cross Go version compatible build tag syntax</li>
<li><a href="e798dc2763"><code>e798dc2</code></a> Add docs on 1.17 build tags</li>
<li><a href="83198c2c50"><code>83198c2</code></a> assert: guard CanConvert call in backward compatible wrapper</li>
<li><a href="087b655c75"><code>087b655</code></a> assert: allow comparing time.Time</li>
<li><a href="7bcf74e94f"><code>7bcf74e</code></a> fix msgAndArgs forwarding</li>
<li><a href="c29de71342"><code>c29de71</code></a> add tests for correct msgAndArgs forwarding</li>
<li><a href="f87e2b2119"><code>f87e2b2</code></a> Update builds</li>
<li><a href="ab6dc32628"><code>ab6dc32</code></a> fix linting errors in /assert package</li>
<li><a href="edff5a049b"><code>edff5a0</code></a> fix funtion name</li>
<li>Additional commits viewable in <a href="https://github.com/stretchr/testify/compare/v1.7.0...v1.7.1">compare view</a></li>
</ul>
</details>
<br />
[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=github.com/stretchr/testify&package-manager=go_modules&previous-version=1.7.0&new-version=1.7.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
</details>
This change implements the logic for the PrepareProposal ABCI++ method call. The main logic for creating and issuing the PrepareProposal request lives in execution.go and is tested in a set of new tests in execution_test.go. This change also updates the mempool mock to use a mockery generated version and removes much of the plumbing for the no longer used ABCIResponses.
* libs/log: remove Must constructor
* Update test/e2e/node/main.go
Co-authored-by: M. J. Fromberger <michael.j.fromberger@gmail.com>
* use stdlog
Co-authored-by: M. J. Fromberger <michael.j.fromberger@gmail.com>
Updates #8077. The panic handler for consensus currently attempts to effect a
clean shutdown, but this can leave a failed node running in an unknown state
for an arbitrary amount of time after the failure.
Since a panic at this point means consensus is already irrecoverably broken, we
should not allow the node to continue executing. After making a best effort to
shut down the writeahead log, re-panic to ensure the node will terminate before
any further state transitions are processed.
Even with this change, it is possible some transitions may occur while the
cleanup is happening. It might be preferable to abort unconditionally without
any attempt at cleanup.
Related changes:
- Clean up the creation of WAL directories.
- Filter WAL close errors at rethrow.
The message handling in this reactor is all under control of the reactor
itself, and does not call out to callbacks or other externally-supplied code.
It doesn't need to check for panics.
- Remove an irrelevant channel ID check.
- Remove an unnecessary panic recovery wrapper.
The PEX reactor has a simple feedback control mechanism to decide how often to
poll peers for peer address updates. The idea is to poll more frequently when
knowledge of the network is less, and decrease frequency as knowledge grows.
This change solves two problems:
1. It is possible in some cases we may poll a peer "too often" and get dropped
by that peer for spamming.
2. The first successful peer update with any content resets the polling timer
to a very long time (10m), meaning if we are unlucky in getting an
incomplete reply while the network is small, we may not try again for a very
long time. This may contribute to difficulties bootstrapping sync.
The main change here is to only update the interval when new information is
added to the system, and not (as before) whenever a request is sent out to a
peer. The rate computation is essentially the same as before, although the code
has been a bit simplified, and I consolidated some of the error handling so
that we don't have to check in multiple places for the same conditions.
Related changes:
- Improve error diagnostics for too-soon and overflow conditions.
- Clean up state handling in the poll interval computation.
- Pin the minimum interval avert a chance of PEX spamming a peer.
* reorganizing basic concepts, adding outline to navigate easy
* Update spec/abci++/README.md
Co-authored-by: Sergio Mena <sergio@informal.systems>
* Update spec/abci++/abci++_basic_concepts_002_draft.md
Co-authored-by: Sergio Mena <sergio@informal.systems>
* Update spec/abci++/abci++_basic_concepts_002_draft.md
Co-authored-by: Sergio Mena <sergio@informal.systems>
* Update spec/abci++/abci++_basic_concepts_002_draft.md
Co-authored-by: Sergio Mena <sergio@informal.systems>
* Update spec/abci++/abci++_basic_concepts_002_draft.md
Co-authored-by: Sergio Mena <sergio@informal.systems>
* Update spec/abci++/abci++_basic_concepts_002_draft.md
Co-authored-by: Sergio Mena <sergio@informal.systems>
* Update spec/abci++/abci++_basic_concepts_002_draft.md
Co-authored-by: Sergio Mena <sergio@informal.systems>
* Update spec/abci++/abci++_basic_concepts_002_draft.md
Co-authored-by: Sergio Mena <sergio@informal.systems>
* Update spec/abci++/abci++_basic_concepts_002_draft.md
Co-authored-by: Sergio Mena <sergio@informal.systems>
* Update spec/abci++/abci++_basic_concepts_002_draft.md
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
* Update spec/abci++/abci++_basic_concepts_002_draft.md
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
* Update spec/abci++/abci++_basic_concepts_002_draft.md
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
* address problem with snapshot list data type
* Update spec/abci++/abci++_basic_concepts_002_draft.md
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
* Update spec/abci++/abci++_basic_concepts_002_draft.md
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
* Update spec/abci++/abci++_basic_concepts_002_draft.md
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
* Update spec/abci++/abci++_basic_concepts_002_draft.md
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
* Update spec/abci++/abci++_basic_concepts_002_draft.md
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
* clarify handling events in same-execution model
* remove outdated text about vote extension singing
* clarification apphash state-sync
Co-authored-by: Sergio Mena <sergio@informal.systems>
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
Fixes#8052 again. Ideally we would have some way of detecting that this
happens before merging, but the way we build docs right now is kind of
complicated.